Abstract



We study the statistical physical properties of (discretized) "random surfaces," which are
random functions from $\mathbb{Z}^d$ (or large subsets of $\mathbb{Z}^d$) to $E$, where $E$ is
$\mathbb{Z}$ or $\mathbb{R}$. Their laws are determined by convex, nearest-neighbor, gradient
Gibbs potentials that are invariant under translation by a full-rank sublattice $\mathcal{L}$ of
$\mathbb{Z}^d$; they include many discrete and continuous height function models (e.g., domino
tilings, square ice, the harmonic crystal, the Ginzburg-Landau $\nabla\phi$ interface model, the
linear solid-on-solid model) as special cases.

We prove a variational principle (characterizing gradient phases of a given slope as
minimizers of the specific free energy) and an empirical measure large deviations principle (with
a unique rate function minimizer) for random surfaces on mesh approximations of bounded
domains. We also prove that the surface tension is strictly convex and that if $u$ is in the
interior of the space of finite-surface-tension slopes, then there exists a minimal energy
gradient phase $\mu_u$ of slope $u$.

Using a new geometric technique called cluster swapping (a variant of the Swendsen-Wang
update for Fortuin-Kasteleyn clusters), we show that $\mu_u$ is unique if at least one of
the following holds: $E = \mathbb{R}$, $d \in \{1, 2\}$, there exists a rough gradient phase of
slope $u$, or $u$ is irrational. When $d = 2$ and $E = \mathbb{Z}$, we show that the slopes of
all smooth phases (a.k.a. crystal facets) lie in the dual lattice of $\mathcal{L}$.

In the case $E = \mathbb{Z}$ and $d = 2$, our results resolve and greatly generalize a number
of conjectures of Cohn, Elkies, and Propp, one of which is that there is a unique ergodic Gibbs
measure on domino tilings for each non-extremal slope. We also prove several theorems cited
by Kenyon, Okounkov, and Sheffield in their recent exact solution of the dimer model on
general planar lattices. In the case $E = \mathbb{R}$, our results generalize and extend many
of the results in the literature on Ginzburg-Landau $\nabla\phi$-interface models.
Chapter 1
Introduction 



The following is a fundamental problem of variational calculus: given a bounded open subset
$D$ of $\mathbb{R}^d$ and a free energy function
$\sigma : D \times \mathbb{R}^m \times \mathbb{R}^{d \times m} \to \mathbb{R}$, find the
differentiable function $f : D \to \mathbb{R}^m$ that (possibly subject to boundary conditions)
minimizes the free energy integral:

$$\int_D \sigma(x, f(x), \nabla f(x)) \, dx.$$

Since the seventeenth century, these free-energy-minimizing functions have been popular 
models for determining (among other things) the shapes assumed by solid objects in the pres- 
ence of outside forces: ropes suspended between poles, elastic sheets stretched to boundary 
conditions, and twisted or otherwise strained three-dimensional solids. They are also useful 
in modeling surfaces of water droplets and other phase interfaces. Rigorous formulations and 
solutions to these problems rank among the great achievements of classical analysis (includ- 
ing work by Fermat, Newton, Leibniz, the Bernoullis, Euler, Lagrange, Legendre, Jacobi, 
Hamilton, Weierstrass, etc. [50]) and play important roles in physics and engineering.

All of these models assume that matter is continuous and distributes force in a continuous 
way. One of the goals of statistical physics has become not merely to solve variational 
problems but to understand and, in some sense, to justify them in light of the fact that 
matter is composed of individual, randomly behaving atoms. To this end, one begins by
postulating a simple form for the local particle interactions: one approach (the one we will
study in this work) is to represent the "atoms" of the solid crystal by points in a subset
$\Lambda$ of $\mathbb{Z}^d$, each of which has a "spatial position" given by a function
$\phi : \Lambda \to \mathbb{R}^m$, and to specify the interaction between the particles by a
Gibbs potential $\Phi$ that possesses certain natural symmetries. The next step is to show that,
at least in some "thermodynamic limit," a random Gibbs configuration will approximate a
free-energy-minimizing function like the ones described above.

Another problem, which has no analog in the deterministic, non-atomic classical theory, is 
the investigation of local statistics of a physical system. How likely are particular microscopic 
configurations of atoms to occur as sub-configurations of a larger system? How are these 
occurrences distributed? To what extent is matter homogeneous throughout small but
non-microscopic regions? Our solutions to these problems will involve large deviations
principles, which we precisely define later on.

Finally, we want to investigate more directly the connections between the Gibbs potential
$\Phi$ and the kinds of behavior that can occur in these small but non-microscopic regions. This
will require us to ask, given $\Phi$, what are the "gradient phases" (i.e., the ergodic gradient
Gibbs measures with finite specific free energy) $\mu$ of a given slope? Does the
$\mu$-variance of the height difference of points $n$ units apart remain bounded independently
of $n$ or does it tend to infinity with $n$? When is the surface tension function $\sigma$
(defined precisely in a later chapter) strictly convex?

Before we state our results precisely and describe some of the previous work in this area, 
we will need several definitions. While we attempt to make our exposition relatively self- 
contained — and define the terms we use precisely — we will also draw heavily from the results 
in some standard texts: Sobolev Spaces by Adams [Tj and recent extensions by Cianchi ([TI]. 
[To] . [TO]); Large Deviations Techniques and Applications by Dembo and Zeitouni [22]; Large 
Deviations by Deuschel and Stroock [26]; and Gibbs Measures and Phase Transitions by
Georgii [13]. We will carefully state, if not prove, the outside theorems we use.

1.1 Random surfaces and gradient Gibbs measures 
1.1.1 Gradient potentials 

The study of random functions from the lattice $\mathbb{Z}^d$ to a measure space
$(E, \mathcal{E})$ is a central component of ergodic theory and statistical physics. In many
classical models from physics (e.g., the Ising model, the Potts model, Shlosman's plane rotor
model), $E$ is a space with a finite underlying measure $\lambda$, $\mathcal{E}$ is the Borel
$\sigma$-algebra of $E$, and $\phi(x)$ has a physical interpretation as the spin (or some other
internal property) of a particle at location $x$ in a crystal lattice. (See, e.g., jlSj.) In the
models of interest to us, $(E, \mathcal{E})$ is a space with an infinite underlying measure
$\lambda$ (either $\mathbb{R}^m$ with Lebesgue measure or $\mathbb{Z}^m$ with counting measure),
where $\mathcal{E}$ is the Borel $\sigma$-algebra of $E$ and $\phi(x)$ usually has a physical
interpretation as the spatial position of a particle (or the vertical height of a phase
interface) at location $x$ in a lattice. For example, if $m = d = 3$, $\phi$ could describe the
spatial positions of the components of an elastic crystal; if $m = 1$ and $d = 2$, $\phi$ could
describe the solid-on-solid or Ginzburg-Landau approximations of a phase interface [30].

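Throughout the exposition, we denote by $\Omega$ the set of functions from $\mathbb{Z}^d$ to
$E$ and by $\mathcal{F}$ the Borel $\sigma$-algebra of the product topology on $\Omega$. If
$\Lambda \subset \mathbb{Z}^d$, we denote by $\mathcal{F}_\Lambda$ the smallest $\sigma$-algebra
with respect to which $\phi(x)$ is measurable for all $x \in \Lambda$. We write
$\mathcal{T}_\Lambda = \mathcal{F}_{\mathbb{Z}^d \setminus \Lambda}$. We write
$\Lambda \subset\subset \mathbb{Z}^d$ if $\Lambda$ is a finite subset of $\mathbb{Z}^d$. A subset
of $\Omega$ is called a cylinder set if it belongs to $\mathcal{F}_\Lambda$ for some
$\Lambda \subset\subset \mathbb{Z}^d$. Let $\mathcal{F}$ be the smallest $\sigma$-algebra on
$\Omega$ containing the cylinder sets. We write $\mathcal{T}$ for the intersection of
$\mathcal{T}_\Lambda$ over all finite subsets $\Lambda$ of $\mathbb{Z}^d$; the sets in
$\mathcal{T}$ are called tail-measurable sets.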

We will also always assume that we are given a family $\Phi$ of measurable potential functions
$\Phi_\Lambda : \Omega \to \mathbb{R} \cup \{\infty\}$ (one for each finite subset $\Lambda$ of
$\mathbb{Z}^d$); each $\Phi_\Lambda$ is $\mathcal{F}_\Lambda$ measurable. We will further assume
that $\Phi$ is invariant under the group $\Theta$ of translations of $\mathbb{Z}^d$ by members of
some rank-$d$ lattice $\mathcal{L}$, i.e., if $s \in \mathcal{L}$, then
$\Phi_{\Lambda + s}(\theta_s \phi) = \Phi_\Lambda(\phi)$, where $\theta_s$ is defined by
$(\theta_s \phi)(i) = \phi(i - s)$. (In many applications, we can take
$\mathcal{L} = \mathbb{Z}^d$.) We also assume that $\Phi$ is invariant under a group $\tau$ of
measure-preserving translations of $E$, i.e., $\Phi_\Lambda(\phi) = \Phi_\Lambda(\tau\phi)$,
where $\tau\phi$ is simply defined by $(\tau\phi)(x) = \tau(\phi(x))$. Potentials $\Phi$
satisfying the above requirements are called $\Theta \times \tau$-invariant potentials or
$\mathcal{L} \times \tau$-invariant potentials. For all of our main results, we will assume that
$\tau$ is the full group of translations of $\mathbb{Z}^m$ or $\mathbb{R}^m$; in this case, each
$\Phi_\Lambda(\phi)$ is a function of the gradient of $\phi$, written $\nabla\phi$ and defined by

$$\nabla\phi(x) = (\phi(x + e_1) - \phi(x), \phi(x + e_2) - \phi(x), \ldots, \phi(x + e_d) - \phi(x)),$$

where the $e_i$ are the standard basis vectors of $\mathbb{Z}^d$. In this setting, we will refer
to $\mathcal{L} \times \tau$-invariant potentials as $\mathcal{L}$-periodic or
$\mathcal{L}$-invariant gradient potentials. We use the term shift-invariant to mean
$\mathcal{L}$-invariant when $\mathcal{L} = \mathbb{Z}^d$. In some of our applications, we also
restrict our attention to nearest-neighbor potentials, i.e., those potentials $\Phi$ for which
$\Phi_\Lambda = 0$ unless $\Lambda$ is a single pair of adjacent vertices in $\mathbb{Z}^d$. We
say that $\Phi$ has finite range if there exists an $r$ such that $\Phi_\Lambda = 0$ whenever
the diameter of $\Lambda$ is greater than $r$. For each finite subset $\Lambda$ of
$\mathbb{Z}^d$ we also define a Hamiltonian:
$H_\Lambda(\phi) = \sum_{\Delta \cap \Lambda \neq \emptyset} \Phi_\Delta(\phi)$, where the sum is
taken over finite subsets $\Delta$ of $\mathbb{Z}^d$.
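
As a concrete illustration of these definitions, the following Python sketch evaluates the discrete gradient and the Hamiltonian $H_\Lambda$ for a nearest-neighbor gradient potential in the case $m = 1$, $d = 2$. The helper names (`basis`, `gradient`, `hamiltonian`) and the choice $V(\eta) = |\eta|$ on every edge are illustrative assumptions, not notation from the text.

```python
from itertools import product

D = 2  # lattice dimension d (illustrative choice)

def basis(i, d=D):
    """Standard basis vector e_i of Z^d as a tuple."""
    return tuple(1 if j == i else 0 for j in range(d))

def add(x, y):
    """Componentwise sum of two lattice points."""
    return tuple(a + b for a, b in zip(x, y))

def gradient(phi, x, d=D):
    """Discrete gradient (phi(x + e_1) - phi(x), ..., phi(x + e_d) - phi(x))."""
    return tuple(phi(add(x, basis(i, d))) - phi(x) for i in range(d))

def hamiltonian(phi, region, V=abs, d=D):
    """H_Lambda(phi) for a nearest-neighbor gradient potential.

    Sums V(phi(y) - phi(x)) over adjacent pairs (x, y = x + e_i) that
    intersect `region`, a finite collection of lattice points.
    """
    region = set(region)
    edges = set()
    for x in region:
        for i in range(d):
            e = basis(i, d)
            edges.add((x, add(x, e)))                           # edge leaving x
            edges.add((tuple(a - b for a, b in zip(x, e)), x))  # edge entering x
    return sum(V(phi(y) - phi(x)) for x, y in edges)

# Example: the tilted plane phi(x) = 2*x_1 + 3*x_2 has constant gradient (2, 3).
plane = lambda x: 2 * x[0] + 3 * x[1]
box = list(product(range(2), repeat=2))  # a 2 x 2 box Lambda
```

Because the potential depends only on height differences, adding a constant to `plane` leaves `hamiltonian(plane, box)` unchanged, which is the invariance under translations of $E$ described above.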

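We define the interior Hamiltonian of $\Lambda$, written $H^o_\Lambda(\phi)$, to be:

$$H^o_\Lambda(\phi) = \sum_{\Delta \subset \Lambda} \Phi_\Delta(\phi).$$

This is different from $H_\Lambda$ because the sum defining $H_\Lambda$ includes sets $\Delta$
that intersect $\Lambda$ but are not contained in $\Lambda$. On the other hand, $H^o_\Lambda$ is
$\mathcal{F}_\Lambda$ measurable, which is not true of $H_\Lambda$. (This $H^o_\Lambda$ is
sometimes called the free boundary Hamiltonian for $\Lambda$.)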

1.1.2 Gibbs Measures 

To define Gibbs measures and gradient Gibbs measures, we will need some additional notation.
Let $(X, \mathcal{X})$ and $(Y, \mathcal{Y})$ be general measure spaces. A function
$\pi : \mathcal{X} \times Y \to [0, \infty]$ is called a probability kernel from
$(Y, \mathcal{Y})$ to $(X, \mathcal{X})$ if

1. $\pi(\cdot|y)$ is a probability measure on $(X, \mathcal{X})$ for each fixed $y \in Y$, and

2. $\pi(A|\cdot)$ is $\mathcal{Y}$-measurable for each fixed $A \in \mathcal{X}$.

Since a probability kernel maps each point in $Y$ to a probability measure on $X$, we may
interpret a probability kernel as giving the law for a random transition from an arbitrary
point in $Y$ to a point in $X$. A probability kernel maps each measure $\mu$ on
$(Y, \mathcal{Y})$ to a measure $\mu\pi$ on $(X, \mathcal{X})$ by

$$\mu\pi(A) = \int \pi(A|\cdot) \, d\mu.$$
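The following is a probability kernel from $(\Omega, \mathcal{T}_\Lambda)$ to
$(\Omega, \mathcal{F})$; in particular, for any fixed $A \in \mathcal{F}$, it is a
$\mathcal{T}_\Lambda$-measurable function of $\phi$:

$$\gamma^\Phi_\Lambda(A|\phi) = Z_\Lambda(\phi)^{-1} \int \prod_{x \in \Lambda} d\phi(x) \, \exp[-H^\Phi_\Lambda(\phi)] \, 1_A(\phi).$$

(When the choice of potential $\Phi$ is clear from context, we write $\gamma^\Phi_\Lambda$ as
$\gamma_\Lambda$.) In this expression, $Z_\Lambda(\phi)$ (which is also $\mathcal{T}_\Lambda$
measurable) is defined as follows:

$$Z_\Lambda(\phi) = \int \prod_{x \in \Lambda} d\phi(x) \, \exp[-H^\Phi_\Lambda(\phi)],$$

where $d\phi(x)$ is the underlying (Lebesgue or counting) measure on $E$. Informally,
$\gamma_\Lambda$ is a random transition from $\Omega$ to itself that takes a function $\phi$ and
then "rerandomizes" $\phi$ within the set $\Lambda$.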
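
To make the "rerandomizing" interpretation concrete, the sketch below computes the kernel exactly when $\Lambda = \{x\}$ is a single site, $d = 1$, and $E = \mathbb{Z}$. It assumes, purely for illustration, a nearest-neighbor potential with $V(\eta) = |\eta|$ for $|\eta| \le K$ and $V(\eta) = \infty$ otherwise, so that only finitely many heights carry positive weight. The names `single_site_kernel`, `rerandomize`, and the parameter `K` are hypothetical.

```python
import math
import random

def single_site_kernel(left, right, K=2):
    """Conditional law of phi(x) given phi(x - 1) = left and phi(x + 1) = right.

    Returns a dict {height: probability}. Weights are exp(-H) with
    H = V(h - left) + V(right - h), V(eta) = |eta| if |eta| <= K else infinity.
    """
    lo = max(left, right) - K   # smallest height within K of both neighbors
    hi = min(left, right) + K   # largest height within K of both neighbors
    weights = {h: math.exp(-(abs(h - left) + abs(right - h)))
               for h in range(lo, hi + 1)}
    Z = sum(weights.values())   # partition function Z_Lambda(phi)
    if Z == 0:
        raise ValueError("configuration is not admissible: Z = 0")
    return {h: w / Z for h, w in weights.items()}

def rerandomize(phi, x, K=2, rng=random):
    """Resample phi[x] from the kernel, leaving all other heights fixed."""
    law = single_site_kernel(phi[x - 1], phi[x + 1], K)
    heights = list(law)
    new = list(phi)
    new[x] = rng.choices(heights, weights=[law[h] for h in heights])[0]
    return new
```

The `ValueError` branch corresponds to a boundary condition for which the partition function vanishes, so the kernel is undefined there.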

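We say $\phi$ has finite energy if $\Phi_\Lambda(\phi) < \infty$ for all
$\Lambda \subset\subset \mathbb{Z}^d$. We say $\phi$ is $\Phi$-admissible if each
$Z_\Lambda(\phi)$ is finite and non-zero. Given a measure $\mu$ on $(\Omega, \mathcal{F})$, we
define a new measure $\mu\gamma_\Lambda$ by

$$\mu\gamma_\Lambda(A) = \int \gamma_\Lambda(A|\cdot) \, d\mu.$$

We say a probability measure $\mu$ on $(\Omega, \mathcal{F})$ is a Gibbs measure if $\mu$ is
supported on the set of $\Phi$-admissible functions in $\Omega$ and for all finite subsets
$\Lambda$ of $\mathbb{Z}^d$, we have $\mu\gamma_\Lambda = \mu$. (In other words, $\mu$ is Gibbs
if and only if $\gamma_\Lambda$ describes a regular conditional probability distribution, where
the $\mu$-conditional distribution of the values of $\phi(x)$ for $x \in \Lambda$, given the
values of $\phi$ outside $\Lambda$, is given by $\gamma_\Lambda(\cdot|\phi)$.)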

A fundamental result in Gibbs measure theory is that for any $\Phi$, the set of
$\Theta$-invariant Gibbs measures is convex and its extreme points are $\Theta$-ergodic. (See,
e.g., Chapter 14 of Georgii's Gibbs Measures and Phase Transitions. More details also appear in
a later chapter of this text.) Since $\Theta$ is understood to be the group of translations by
a sublattice $\mathcal{L}$ of $\mathbb{Z}^d$, we will also use the terms $\mathcal{L}$-invariant
and $\mathcal{L}$-ergodic. In physics jargon, the $\mathcal{L}$-ergodic measures are the pure
phases and a phase transition occurs at potentials $\Phi$ which admit more than one
$\mathcal{L}$-ergodic Gibbs measure.

1.1.3 Gradient Gibbs Measures 

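Let $\tau$ be the group of translations of $E$, and let $\mathcal{F}^\tau$ be the
$\sigma$-algebra containing $\tau$-invariant sets of $\mathcal{F}$; this is the smallest
$\sigma$-algebra containing the sets of the form $\{\phi \,|\, \phi(y) - \phi(x) \in S\}$ where
$x, y \in \mathbb{Z}^d$ and $S \in \mathcal{E}$. In other words, $\mathcal{F}^\tau$ is the
subset of $\mathcal{F}$ containing those sets that are invariant under translations
$\phi \mapsto \phi + z$ for $z \in E$. (Similarly, we write
$\mathcal{T}^\tau_\Lambda = \mathcal{T}_\Lambda \cap \mathcal{F}^\tau$ and
$\mathcal{F}^\tau_\Lambda = \mathcal{F}_\Lambda \cap \mathcal{F}^\tau$.) Let $\Phi$ be an
$\mathcal{L}$-invariant gradient potential. Since, given any $A \in \mathcal{F}^\tau$, the
kernels $\gamma^\Phi_\Lambda(A|\phi)$ are $\mathcal{F}^\tau$-measurable functions of $\phi$, it
follows that the kernel sends a given measure $\mu$ on $(\Omega, \mathcal{F}^\tau)$ to another
measure $\mu\gamma^\Phi_\Lambda$ on $(\Omega, \mathcal{F}^\tau)$. A measure $\mu$ on
$(\Omega, \mathcal{F}^\tau)$ is called a gradient Gibbs measure if it is supported on admissible
functions and $\mu\gamma^\Phi_\Lambda = \mu$ for every $\Lambda \subset\subset \mathbb{Z}^d$.
Note that this is the same as the definition of Gibbs measure except that in this case the
$\sigma$-algebra is different.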
Clearly, if $\mu$ is a Gibbs measure on $(\Omega, \mathcal{F})$, then its restriction to
$\mathcal{F}^\tau$ is a gradient Gibbs measure. A gradient Gibbs measure is said to be localized
or smooth if it arises as the restriction of a Gibbs measure in this way. Otherwise, it is
non-localized or rough. (Many natural gradient Gibbs measures are rough when $d \in \{1, 2\}$;
for example, all the ergodic gradient Gibbs measures of the continuous, nearest-neighbor
Gaussian models in these dimensions are rough; see, e.g., jlSj.) Moreover, the restriction of
$\mu$ to $\mathcal{F}^\tau$ may be $\mathcal{L}$-invariant even when $\mu$ itself is not.

Denote by $\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$ the set of
$\mathcal{L}$-invariant probability measures on $(\Omega, \mathcal{F}^\tau)$ and by
$\mathcal{G}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$ the set of $\mathcal{L}$-invariant gradient
Gibbs measures. We say that a $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$ has
finite slope if $\mu(\phi(y) - \phi(x))$ is finite for all pairs $x, y \in \mathbb{Z}^d$.
(Throughout the text, we use the notation $\mu(f) = \int f \, d\mu$.) One easily checks that
there is a unique $m \times d$ matrix $u$ such that $\mu(\phi(y) - \phi(x)) = u(y - x)$ (where
$u(y - x)$ denotes the matrix product of $u$ and $(y - x)$) whenever $y - x \in \mathcal{L}$. In
this case, we call $u$ the slope of $\mu$, which we write as $S(\mu)$. Analogously to the
non-gradient case, the extreme points of $\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$
are called $\mathcal{L}$-ergodic gradient measures and the extreme points of
$\mathcal{G}(\Omega, \mathcal{F}^\tau)$ are called extremal gradient Gibbs measures. We discuss
these notions in more detail in a later chapter. Although the term "phase" has many definitions
in the physics literature, when a full rank sublattice $\mathcal{L}$ of $\mathbb{Z}^d$ is given,
we will always use the term gradient phase to mean an $\mathcal{L}$-ergodic gradient Gibbs
measure with finite specific free energy (a term we define precisely in a later chapter). A
minimal gradient phase is a gradient phase of some slope $u$ for which the specific free energy
is minimal among the set of all slope-$u$, $\mathcal{L}$-invariant gradient measures.

1.1.4 Classes of periodic gradient potentials 

When $m = 1$, we say a potential $\Phi$ is simply attractive if $\Phi$ is an
$\mathcal{L}$-invariant nearest-neighbor gradient potential such that for each adjacent pair of
vertices $x$ and $y$, with $x$ preceding $y$ in the lexicographic ordering of $\mathbb{Z}^d$, we
have $\Phi_{\{x,y\}}(\phi) = V_{x,y}(\phi(y) - \phi(x))$, where the
$V_{x,y} : \mathbb{R} \to [0, \infty]$ are convex and positive, and
$\lim_{\eta \to \infty} V_{x,y}(\eta) = \lim_{\eta \to -\infty} V_{x,y}(\eta) = \infty$. As
before, we assume here that $\mathcal{L}$ is a full-rank sublattice of $\mathbb{Z}^d$. For
convenience, we will always assume $V_{x,y} = 0$ if $x$ and $y$ are not adjacent or $x$ does not
precede $y$ in the lexicographic ordering of $\mathbb{Z}^d$. When we refer to the nearest
neighbor potential $V_{x,y}$ for "an adjacent pair $x, y$" we will assume implicitly that $x$
precedes $y$ in lexicographic ordering.

Note that each $V_{x,y}$ attains its minimum at at least one point $\eta \in \mathbb{R}$. In
many applications, we can assume $\eta = 0$; in this case, the requirement that $V_{x,y}$ be
convex implies that the model is "attractive" or "ferromagnetic" in the sense that the energy is
lower when neighboring heights are close to one another than when they are far apart.

We chose to invent the term "simply attractive potential" because the obvious alterna- 
tives were either too long ("convex nearest-neighbor periodic difference potential") or too 
overloaded and/or imprecise ("ferromagnetic potential," "elastic potential," "anharmonic 
crystal potential," "solid-on-solid potential"). Elsewhere in the literature, the latter terms 
have definitions that are more general or more specific than ours, although they usually agree 
in spirit. 

Also, when $m = 1$, we say $\Phi$ is isotropic if for some $V : \mathbb{R} \to \mathbb{R}$
(which must be positive, convex, and even, i.e., $V(\eta) = V(-\eta)$) we have
$V_{x,y}(\eta) = V(\eta)$ for all adjacent pairs $x, y \in \mathbb{Z}^d$. We say $\Phi$ is
Lipschitz if there exist $\eta_1, \eta_2 \in \mathbb{R}$ such that for all adjacent
$x, y \in \mathbb{Z}^d$, we have $V_{x,y}(\eta) = \infty$ whenever $\eta < \eta_1$ or
$\eta > \eta_2$. We will frequently use the following abbreviations:

1. SAP: Simply attractive potential

2. ISAP: Isotropic simply attractive potential. We write $\Phi_V$ to denote the isotropic simply
attractive potential in which each $V_{x,y} = V$.

3. LSAP: Lipschitz simply attractive potential
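
The following sketch gives one hypothetical example of $V$ for each flavor of potential and checks evenness and convexity numerically on a window of integers. The function names, the window radius, and the specific choices of $V$ are assumptions made for illustration; a finite check of this kind is a sanity test, not a proof of convexity.

```python
import math

def V_gaussian(eta):
    """Quadratic (Gaussian / harmonic crystal) interaction: isotropic."""
    return eta * eta / 2.0

def V_sos(eta):
    """Linear solid-on-solid interaction V(eta) = |eta|: isotropic."""
    return abs(eta)

def V_lipschitz(eta, eta1=-1, eta2=1):
    """Lipschitz-type interaction: infinite outside [eta1, eta2]."""
    return 0.0 if eta1 <= eta <= eta2 else math.inf

def is_even(V, radius=5):
    """Check V(eta) == V(-eta) for integers 0 <= eta <= radius."""
    return all(V(eta) == V(-eta) for eta in range(radius + 1))

def is_convex_on_integers(V, radius=5):
    """Check midpoint convexity V(eta) <= (V(eta - 1) + V(eta + 1)) / 2
    for integers eta in [-radius, radius]; infinite values are allowed."""
    return all(V(eta) <= (V(eta - 1) + V(eta + 1)) / 2
               for eta in range(-radius, radius + 1))

def V_truncated(eta):
    """A non-convex interaction, included only as a negative example."""
    return min(abs(eta), 1)
```

Here `V_truncated` fails the convexity check at $\eta = 1$, so it would not define a simply attractive potential.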

Most of the simply attractive models discussed in the statistical physics literature are 
either Lipschitz and have E = Z (e.g., height function models for perfect matchings on 
lattice graphs jSZ] and square ice jlj) or isotropic (e.g., linear solid-on-solid, Gaussian, and 
Ginzburg-Landau models, [2H]). 

We say that a potential $\Phi$ strictly dominates a potential $\Psi$ if there exists a constant
$0 < c < 1$ such that $|H^\Psi_\Lambda(\phi)| \le c \, |H^\Phi_\Lambda(\phi)|$ for all
$\Lambda \subset\subset \mathbb{Z}^d$ and $\phi \in \Omega$. (If $m > 1$, we replace
the absolute value signs in this definition by the Euclidean norm.)

When $m > 1$, we can write any $\phi \in \Omega$ as
$\phi_1 e_1 + \phi_2 e_2 + \ldots + \phi_m e_m$ where the $\phi_i$ are real valued (or integer
valued) and the $e_i$ are the standard basis vectors in $E$. In this setting, we say that $\Phi$
is an SAP (resp., ISAP, LSAP) if it can be written as $\Phi(\phi) = \sum_{i=1}^m \Phi_i(\phi_i)$,
where each of the $\Phi_i$ is a one-dimensional simply attractive potential. For any $m$, a
perturbed SAP (resp., perturbed LSAP, perturbed ISAP) is an $\mathcal{L}$-periodic gradient
potential of the form $\Phi + \Psi$ where $\Phi$ is an SAP (resp., LSAP, ISAP), $\Psi$ has
finite range, and $\Phi$ strictly dominates $\Psi$.

Note that when $m > 1$, our class of simply attractive potentials is rather restrictive; each
one can be decomposed into a sum of $m$ simply attractive potentials, one in each coordinate
direction. The class of perturbed SAPs is much larger. For example, if $\Phi$ is a nearest
neighbor gradient potential defined by $\Phi_{\{x,y\}}(\phi) = V_{x,y}(\phi(x) - \phi(y))$ when
$m = 1$, then we can define a radially symmetric higher dimensional potential $\tilde{\Phi}$ by
$\tilde{\Phi}_{\{x,y\}}(\psi) = V_{x,y}(\|\psi(x) - \psi(y)\|)$, for
$\psi : \mathbb{Z}^d \to \mathbb{R}^m$. If the $V_{x,y}$ are increasing on $[0, \infty)$ and for
some $b > 0$ satisfy the condition that $V_{x,y}(m\eta) \le b V_{x,y}(\eta)$ for all $\eta$,
then $\tilde{\Phi}$ is a perturbed simply attractive potential. (Observe that
$\Psi_{\{x,y\}}(\psi) = b \sum_{i=1}^m V_{x,y}(|\psi(x)_i - \psi(y)_i|)$ is simply attractive.
Note that $m \sup_{1 \le i \le m} |\eta_i| \ge \|\eta\| \ge \sup_{1 \le i \le m} |\eta_i|$.
Thus, $b \sum_{i=1}^m V_{x,y}(|\eta_i|) \ge V_{x,y}(\|\eta\|) \ge \frac{1}{m} \sum_{i=1}^m V_{x,y}(|\eta_i|)$;
it follows that $\Psi \ge \tilde{\Phi} \ge \frac{1}{bm} \Psi$ and $\Psi$ strictly dominates
$\Psi - \tilde{\Phi}$.) It is also easy to see that the sum of a perturbed simply attractive
potential and any bounded potential is (at least after adding a constant) a perturbed simply
attractive potential.

SAPs and perturbed SAPs are (respectively) the most general convex and
not-necessarily-convex potentials we consider. Most of the general constructions in this text
apply to all perturbed SAPs and are valid for any $E = \mathbb{R}^m$ or $E = \mathbb{Z}^m$. The
results of the chapter on Orlicz-Sobolev spaces (see Section 1.2.2) are analytical results used
in later chapters; most of them are stated in terms of ISAPs with $m = 1$. The variational
principle and large deviations principle results (the latter proved in Chapter 7) apply to
perturbed ISAPs and perturbed LSAPs and are valid for any $E = \mathbb{R}^m$ or
$E = \mathbb{Z}^m$. (We will actually prove the results first for ISAPs and LSAPs when $m = 1$
and then observe that extensions to perturbed versions and to general $m$ are straightforward.)
All of the surface tension strict convexity and gradient phase classifications in Chapters 8
and 9 apply to SAPs in the case $m = 1$.

1.2 Overview of remaining chapters 

1.2.1 Specific free energy and surface tension 

In Chapter 2 we define the specific free energy of a measure
$\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$ (denoted $SFE(\mu)$) and prove
several consequences of that definition. In particular, we show that if $\mu$ has slope $u$ and
has minimal specific free energy among measures of slope $u$, then $\mu$ is necessarily a
gradient Gibbs measure. (This is the first half of our variational principle.) We discuss
ergodic and extremal decompositions in a subsequent chapter and prove that $SFE(\mu)$ can be
written as the $\mu$-expectation of a particular tail-measurable function that is independent
of $\mu$. (In particular, $SFE$ is affine.) These definitions and results are analogous to those
of the standard reference text [33], although the setting is different. We then define the
surface tension $\sigma(u)$ to be the infimum of $SFE(\mu)$ over all slope-$u$ measures
$\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$. The pressure of a potential
$\Phi$ is the infimum of the values assumed by $\sigma$ and is denoted $P(\Phi)$. Let $U_\Phi$
be the interior of the set of slopes $u$ for which $\sigma(u) < \infty$. We will see that
whenever $\Phi$ is a perturbed SAP, the set $U_\Phi$ is either all of $\mathbb{R}^{d \times m}$
or the intersection of finitely many half spaces. Several equivalent definitions of specific
free energy and surface tension are given in a later chapter.

1.2.2 Orlicz-Sobolev spaces and other analytical results 

In a later chapter we define Orlicz-Sobolev spaces and cite a number of standard results about
them (compactness of embeddings, equivalence of spaces, miscellaneous bounds, etc.) from 
[H|, [T3], [TH], P, and [70]. The Orlicz-Sobolev space theory will enable us to derive (in 
some sense) the strongest possible topology on surface shapes (usually a topology induced 
by the norm of an Orlicz-Sobolev space) in which our large deviations principles on surface 
shapes apply. 

For example, this will enable us to prove that our large deviations principles for the
two-dimensional Ginzburg-Landau models apply in any $L^p$ topology with $p < \infty$, whereas
these results were only proved for the $L^2$ topology in jJUJ and [23]. This allows us in
particular to produce stronger concentration inequalities: to show that typical random surfaces
are "close" to free-energy minimizing surfaces in an $L^p$ sense instead of merely an $L^2$
sense. One
of the reasons that Orlicz-Sobolev space theory was developed was to provide tight conditions 
for the existence of bounded solutions to PDE's and to variational problems involving the 
minimization of free energy integrals; so it is not too surprising that these tools should be 
applicable to the discretized/randomized versions of these problems as well. 
1.2.3 Large deviations principle

In a later chapter we derive several equivalent definitions of the specific free energy and
surface tension. We also complete the proof of the variational principle (for perturbed ISAP
and discrete LSAP models), which states the following: if
$\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$ is $\mathcal{L}$-ergodic and has
finite specific free energy and slope $u$, then $\mu$ is a gradient Gibbs measure if and only if
$SFE(\mu) = \sigma(u)$. In particular, every gradient phase of slope $u$ is a minimal gradient
phase. In Chapter 7 we derive a large deviations principle for normalized height function
shapes and "empirical measure profiles." Following standard notation (see, e.g., Section 1.2 of
[22]), we say that a sequence of measures $\mu_n$ on a topological space $(X, \mathcal{X})$
satisfies a large deviations principle with rate function $I$ and speed $n^d$ if
$I : X \to [0, \infty]$ is lower-semicontinuous and for all sets $B \in \mathcal{X}$,

$$-\inf_{x \in B^\circ} I(x) \le \liminf_{n \to \infty} n^{-d} \log \mu_n(B) \le \limsup_{n \to \infty} n^{-d} \log \mu_n(B) \le -\inf_{x \in \overline{B}} I(x).$$

Here $B^\circ$ is the interior and $\overline{B}$ the closure of $B$. Roughly speaking, we can
think of $I(x)$ as describing the exponential "rate" (in terms of $n^d$) at which
$\mu_n(B_x)$ decreases when $B_x$ is a very small neighborhood of $x$. Also, note that if $I$
attains its minimum at a unique value $x_0 \in X$ and $B$ is any neighborhood of $x_0$, then
$\mu_n(X \setminus B)$ decays exponentially at rate $\inf_{x \in X \setminus B} I(x)$ (whenever
this value is non-zero). We refer to bounds of this form as concentration inequalities, since
they bound the rate at which $\mu_n$ tends to be concentrated in small neighborhoods of $x_0$.
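
As a toy illustration of these definitions (far simpler than the random surface setting, and not taken from the text), consider the fraction of heads in $n$ fair coin flips, with speed $n$ (that is, $d = 1$). Cramér's theorem gives the rate function $I(a) = a \log(2a) + (1 - a) \log(2(1 - a))$ for $0 < a < 1$. The sketch below compares $-n^{-1} \log$ of the exact tail probability with $I(0.7)$; the function names `rate` and `empirical_rate` are hypothetical.

```python
import math

def rate(a):
    """Cramer rate function for the mean of fair Bernoulli variables, 0 < a < 1."""
    return a * math.log(2 * a) + (1 - a) * math.log(2 * (1 - a))

def empirical_rate(n, k_min):
    """-(1/n) log P(S_n >= k_min), where S_n counts heads in n fair coin flips."""
    tail = sum(math.comb(n, k) for k in range(k_min, n + 1))  # exact integer count
    log_prob = math.log(tail) - n * math.log(2)
    return -log_prob / n

for n in (100, 1000, 2000):
    k = (7 * n) // 10   # event: at least 70% heads
    print(n, round(empirical_rate(n, k), 4), round(rate(0.7), 4))
```

The printed values of `empirical_rate` approach `rate(0.7)` from above as $n$ grows, which is the kind of exponential concentration bound described in the preceding paragraph.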

By choosing the topological spaces appropriately, we will produce a large deviations
principle on random surface measures which (although its formulation is rather technical)
encodes a great deal of information about both the typical local statistics and global "shapes"
of the surfaces. Though we defer a complete formulation until Chapter 7, a rough but almost
complete statement of our main large deviations result is the following. Let $D$ be a bounded
domain in $\mathbb{R}^d$ (satisfying a suitable isoperimetric inequality), and write
$D_n = nD \cap \mathbb{Z}^d$. Let $\Phi$ be a perturbed ISAP or LSAP, and use $H^o_{D_n}$ to
define a Gibbs measure $\mu_n$ on gradient configurations on $D_n$. Given
$\phi_n : D_n \to E$, we define an empirical profile measure
$R_{\phi_n, n} \in \mathcal{P}(D \times \Omega)$ as follows:

$$R_{\phi_n, n} = \int_D \delta_{(x, \, \theta_{\lfloor nx \rfloor}(\phi_n))} \, dx,$$

(where $(\theta_y \phi)(x) := \phi(x + y)$). Informally, to sample a point $(x, a)$ from
$R_{\phi_n, n}$, we choose $x$ uniformly from $D$ and then set
$a = \theta_{\lfloor nx \rfloor} \phi_n$ (where $\phi_n$ is defined to be zero or some other
arbitrary value outside of $D_n$), i.e., $a$ is $\phi_n$ shifted so that the origin is near $x$.
Also, using $\phi_n$, we will define a function $f_n : D \to \mathbb{R}^m$ by interpolating the
function $\frac{1}{n} \phi_n(nx)$ to a continuous, piecewise linear function on $D$; each such
$f_n$ will be a member of an appropriate Orlicz-Sobolev space $L^A$ (actually, a slight
extension $L^A_0$ of $L^A$ to include functions defined on most but not all of $D$) where $A$ is
a function derived from $V$ that we define later.
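
In one dimension the rescaling, interpolation, and sampling steps can be sketched as follows. This is a simplified illustration under stated assumptions: $d = m = 1$, the domain is the unit interval, `heights[k]` plays the role of $\phi_n(k)$ for $k = 0, \ldots, n$ (including the endpoints for convenience), and the value outside the domain is taken to be zero. The names `rescaled_shape` and `sample_profile_point` are hypothetical.

```python
import math
import random

def rescaled_shape(heights):
    """Return f_n on [0, 1]: the piecewise linear interpolation of phi_n(n x) / n."""
    n = len(heights) - 1
    def f(x):
        t = min(max(x, 0.0), 1.0) * n
        k = min(int(math.floor(t)), n - 1)   # left endpoint of the mesh cell
        frac = t - k
        return ((1 - frac) * heights[k] + frac * heights[k + 1]) / n
    return f

def sample_profile_point(heights, rng=random):
    """Sample (x, a): x uniform on the unit interval, a = phi_n shifted by floor(n x)."""
    n = len(heights) - 1
    x = rng.random()
    shift = math.floor(n * x)
    def a(z):
        k = z + shift
        return heights[k] if 0 <= k <= n else 0   # arbitrary value outside the domain
    return x, a
```

For example, with `heights = [2 * k for k in range(11)]` the rescaled shape is the linear function with slope 2, and the shifted configuration `a` returned by `sample_profile_point` satisfies `a(0) == heights[floor(n * x)]`.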

Let $\rho_n$ be the measure on $\mathcal{P}(D \times \Omega) \times L^A$ induced by $\mu_n$ and
the map $\phi_n \mapsto (R_{\phi_n, n}, f_n)$. We say a measure
$\mu \in \mathcal{P}(D \times \Omega)$ is $\mathcal{L}$-invariant if $\mu(\cdot, \Omega)$ is
Lebesgue measure on $D$ and for any $D' \subset D$ of positive Lebesgue measure,
$\mu(D', \cdot)$ is an $\mathcal{L}$-invariant measure on $\Omega$. Given any subset $D'$ of $D$
with positive Lebesgue measure, we can write $S(\mu(D', \cdot))$ for the slope of the measure
$\mu(D', \cdot)/\mu(D' \times \Omega)$ (we have normalized to make this a probability measure)
times $\mu(D' \times \Omega)$. The map $D' \mapsto S(\mu(D', \cdot))$ is a signed, vector-valued
measure on $D$, and in particular, when $\psi$ is smooth, we can define integrals
$\int \psi(x) S(\mu(x, \cdot)) \, dx$, which we expect to be the same as the integral of the
gradient of the limiting surface shape $f$, i.e., $\int \psi(x) \nabla f(x) \, dx$. In fact, we
show that the $\rho_n$ satisfy a large deviations principle with speed $n^d$ and rate function

$$I(\mu, f) = \begin{cases} SFE(\mu(D, \cdot)) - P(\Phi) & \text{if } \mu \text{ is } \mathcal{L}\text{-invariant and } S(\mu(x, \cdot)) = \nabla f(x) \text{ as a distribution,} \\ \infty & \text{otherwise,} \end{cases}$$

in an appropriate topology on the space $\mathcal{P}(D \times \Omega) \times L^A_0$. Contraction
to the first coordinate yields an "empirical profile" large deviations principle; contraction
to the second coordinate yields a "surface shape" large deviations principle. Analogous results
apply in the presence of boundary restrictions on the $\phi_n$.

In Section 7.3.3 we will see that the introduction of "gravity" or other "external fields"
to our models alters the rate function $I$ in a predictable way; by computing the rate function
minimizer of the modified systems, we can describe the way "typical surface behavior"
changes in the presence of external fields. In fact, the ease of making changes of this form
is one of the main appeals of the large deviations formalism in statistical physics in general:
the rate function tells not only the "typical" macroscopic behavior but also the relative free
energies of all of the "less likely" behaviors which may become typical when the system is
modified.

1.2.4 Surface tension strict convexity and Gibbs measure classifications

The results in Chapters 8 and 9 pertain only to the case that $m = 1$ and $\Phi$ is
a simply attractive potential. In Chapter 8 we introduce a geometric construction called
cluster swapping that we use to prove that the surface tension $\sigma$ is strictly convex and
to classify gradient Gibbs measures. In some cases, these results will allow us to prove the
uniqueness of the minimum of the rate function of the LDP derived in Chapter 7 (and
hence, also some corresponding concentration inequalities). For every $u \in U_\Phi$, there
exists at least one minimal gradient phase $\mu_u$ of slope $u$. We prove that each of the
following is sufficient to guarantee that this minimal gradient phase is unique:

1. $E = \mathbb{R}$

2. There exists a rough minimal gradient phase of slope $u$.

3. One of the $d$ components of $u$ is irrational.

Each of the first two conditions also implies that $\mu_u$ is extremal. Whenever a minimal
gradient phase of slope $u$ fails to be extremal, it is necessarily smooth. We show that
the extremal components of $\mu_u$ can be characterized by their asymptotic "average heights"
modulo 1, and that every smooth minimal gradient phase is characterized by its slope and
its "height offset spectrum," which is a measure on $[0, 1)$ that is ergodic under translations
(modulo one) by the inner products $(u, x)$, for $x \in \mathcal{L}$. We give examples of models
with non-trivial height offset spectra and minimal gradient phase multiplicity (a kind of phase
transition) that occur when $d \ge 3$, $E = \mathbb{Z}$, and $u$ is rational.

In Chapter 9 we specialize to models in which $\Phi$ is simply attractive, $d = 2$, and
$E = \mathbb{Z}$; many classical models (e.g., perfect matchings of periodic weighted graphs,
square ice, certain six-vertex models) belong to this category. These models are sometimes used
to describe the surface of a crystal at equilibrium. We show that in this setting, the height
offset spectra of smooth phases are always point masses in $[0, 1)$. In this setting, $\mu_u$
is unique and extremal for every $u \in U_\Phi$ and the slopes of all smooth minimal phases
(also called crystal facets) lie in the dual lattice of $\mathcal{L}$.

1.2.5 Differences from previous work 

Before reading on, the reader may wish to know which aspects of our research we would 
expect a researcher with years of experience in large deviations theory and statistical physics 
to find new or surprising. 

For readers who have studied the variational principle in the context of, say, the Ising 
model, our random surface formulation — that an ergodic gradient measure is Gibbs if and 
only if it minimizes specific free energy among measures of that slope — may not come as a 
huge surprise. Indeed, it may surprise the reader that nobody had formulated and proved 
this fundamental result before. 

The fact that the large deviations principle extends to empirical measure profiles requires 
many technical advances, but the result itself is also not shocking (in light of the many similar 
results known for, say, the Ising model). Readers who learned Sobolev space theory a couple 
of decades ago may be somewhat surprised to learn how much stronger, simpler, and more 
intuitive the theory has become — and how much of it seems to have been custom-made 
for our research. Instead of imposing lots of ad hoc conditions, we can now derive large 
deviations principles in the "right" topologies and with the "right" boundary conditions 
while citing most of the needed analytical results from other sources. 

But in our view, the most surprising aspect of our large deviations principles is the 
generality in which we prove uniqueness of the rate function minimizer. This uniqueness is a 
consequence of two key results: the strict convexity of $\sigma$ and the uniqueness of the gradient
Gibbs measure of a given slope. Both of these results are proved in Chapters 8 and 9 using 
the variational principle and a new geometric construction called "cluster swapping." 

Before our work, some researchers suspected that if $V$ failed to be strictly convex, then
the surface tension $\sigma$ corresponding to the ISAP $\Phi_V$ would also fail to be strictly convex. In
particular, it was unknown whether the surface tension corresponding to the linear solid-on-solid
model $V(x) = |x|$ was everywhere strictly convex in both the discrete and continuous
height versions.

Also, although the uniqueness of the gradient Gibbs measure of a given slope was known
for Ginzburg-Landau $\nabla\phi$ models and conjectured for some discrete models (see jTH], [HI],
and the next section), our statement, particularly in the discrete case $E = \mathbb{Z}$, is much more
general than had been conjectured.

Finally, our discrete model analysis of the smooth-phase/rough-phase distinction in Chapters
8 and 9 is new. The "height offset spectrum" decomposition for general $d$, and the fact
that when $d = 2$ all smooth phases have slopes in the dual lattice of $\mathcal{L}$, were both, to our
knowledge, unexpected. Indeed, the dimer model analog of the smooth phase classification
theorem is one of the more surprising qualitative results in j^lj. In addition to cluster swapping,
our proofs of these results use, in a new way, the FKG inequality and the homotopy
theory of the countably punctured plane.

1.3 Two important special cases 

Special cases of what we call simply attractive potentials have been very thoroughly studied 
in a variety of settings. In this section, we will briefly review relevant facts about two of the 
most well-understood random surface models: domino tiling height functions (here $E = \mathbb{Z}$)
and the Ginzburg-Landau $\nabla\phi$-interface models (here $E = \mathbb{R}$). Each of these models is the
subject of a sizable literature, and each has features that make it easier to work with than 
general simply attractive or perturbed simply attractive potentials. 

An exhaustive survey of the myriad physical, analytical, probabilistic, and combinatorial 
results available for even these two models — let alone all simply attractive models — is 
beyond the scope of this work. But we will mention a few of the papers and conjectures that 
directly inspired our results and provide pointers to the broader literature. See the survey 
papers |55] . jUj, and [IS] for more details. 

1.3.1 Domino tilings 

Though we mentioned the modeling of solids and phase interfaces as one motivation for our 
work, the models we describe have been used for other purposes as well. When $E = \mathbb{Z}$ and
$\Phi$ is chosen appropriately, the finite-energy surfaces $\phi \in \Omega$ correspond to the so-called height
functions that are known to be in one-to-one correspondence with spaces of domino tilings, 
square ice configurations, and other discrete statistical physics models. 

We will now explicitly describe a well-known correspondence between the set of domino
tilings of a simply connected subset $R$ of the squares of the $\mathbb{Z}^2$ lattice and the set of height
functions from the vertices of $R$ to $\mathbb{Z}$ that satisfy certain boundary conditions and difference
constraints. Let $\epsilon : \mathbb{Z}^2 \to \{0, 1, 2, 3\}$ be such that if $i = (i_1, i_2) \in \mathbb{Z}^2$, then $\epsilon(i_1, i_2)$ assumes
the values 0, 1, 2, and 3 as the value of $i$ modulo 2 is respectively $(0,0)$, $(0,1)$, $(1,1)$, and
$(1,0)$. Given a perfect matching of the squares of $\mathbb{Z}^2$ (which we can think of as a "domino
tiling", where a domino corresponds to an edge in the matching), a height function $\psi$ on the
vertices of $\mathbb{Z}^2$ is determined, up to an additive constant, by the following two requirements:

1. $\psi(x) = \epsilon(x) \bmod 4$

2. If $x$ and $y$ are neighboring vertices, then $|\psi(x) - \psi(y)| = 3$ if the edge between them
crosses a domino and 1 otherwise.

In the height functions thus defined, the set of possible values of $\psi(x)$ depends on the parity
of $x$; in order to describe these height functions as the finite-energy surfaces of a Gibbs
potential, we will instead use a slight modification: $\phi(x) = (\psi(x) - \epsilon(x))/4$. Now the set of
possible heights at any vertex is equal to $\mathbb{Z}$. The set of height functions $\phi$ of this form that arise from
tilings are precisely those functions $\phi : \mathbb{Z}^2 \to \mathbb{Z}$ for which $H^\Phi_\Lambda(\phi)$ is finite for all $\Lambda$, where $\Phi$
is the LSAP determined by the following nearest neighbor potentials:

$$V_{x,y}(\eta) = \begin{cases} 0 & \eta = 0 \\ 0 & \epsilon(x) > \epsilon(y) \text{ and } \eta = 1 \\ 0 & \epsilon(x) < \epsilon(y) \text{ and } \eta = -1 \\ \infty & \text{otherwise.} \end{cases}$$
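As an informal illustration (a sketch, not part of the original argument), the following Python code builds $\psi$ and $\phi$ from a tiling of a small rectangular block of squares and checks the difference constraints. It assumes the normalization $\phi = (\psi - \epsilon)/4$ with $\psi(0,0) = 0$ and $\eta = \phi(y) - \phi(x)$; the names `eps`, `heights`, and `allowed`, and the labelling of squares by their lower-left corner, are choices made here for illustration only.

```python
from collections import deque

def eps(v):
    """epsilon(i1, i2) = 0, 1, 2, 3 as (i1, i2) mod 2 is (0,0), (0,1), (1,1), (1,0)."""
    return {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}[(v[0] % 2, v[1] % 2)]

def allowed(x, y, eta):
    """Finite-energy condition on eta = phi(y) - phi(x) for neighbouring x, y."""
    return eta == 0 or (eps(x) > eps(y) and eta == 1) or (eps(x) < eps(y) and eta == -1)

def heights(width, height, dominoes):
    """Return (psi, phi) on the vertices of a width x height block of unit squares.

    Squares are labelled by their lower-left corner (an illustrative convention);
    `dominoes` is a list of pairs of adjacent squares covering the block once.
    """
    partner = {}
    for s, t in dominoes:
        partner[s], partner[t] = t, s
    assert len(partner) == width * height, "dominoes must tile the block"

    def crosses(x, y):
        # x < y are adjacent vertices; find the two squares bordering edge {x, y}.
        if y[0] == x[0] + 1:      # horizontal edge: squares below and above it
            s, t = (x[0], x[1] - 1), (x[0], x[1])
        else:                     # vertical edge: squares to its left and right
            s, t = (x[0] - 1, x[1]), (x[0], x[1])
        return partner.get(s) == t

    def step(x, y):
        # psi(y) - psi(x) is congruent to eps(y) - eps(x) mod 4, with |.| in {1, 3}.
        sign = 1 if (eps(y) - eps(x)) % 4 == 1 else -1
        return -3 * sign if crosses(min(x, y), max(x, y)) else sign

    psi = {(0, 0): 0}
    queue = deque([(0, 0)])
    while queue:
        x = queue.popleft()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            y = (x[0] + dx, x[1] + dy)
            if not (0 <= y[0] <= width and 0 <= y[1] <= height):
                continue
            value = psi[x] + step(x, y)
            if y in psi:
                assert psi[y] == value, "height function is not well defined"
            else:
                psi[y] = value
                queue.append(y)
    phi = {v: (psi[v] - eps(v)) // 4 for v in psi}
    return psi, phi

# Two horizontal dominoes stacked in a 2 x 2 block.
psi, phi = heights(2, 2, [((0, 0), (1, 0)), ((0, 1), (1, 1))])
```

The breadth-first pass doubles as a consistency check: each edge is visited from both endpoints, so an ill-defined height would trip the assertion.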

Because of this correspondence, we may think of a domino tiling chosen uniformly from the
set of all domino tilings of a simply connected subset $R$ of the squares of $\mathbb{Z}^2$ as a discretized,
incrementally varying random surface. We can also think of domino tilings as perfect matchings
of a bipartite graph. The number of perfect matchings of a bipartite graph can be written as
the permanent of an adjacency matrix. Kasteleyn observed in 1965 that
by replacing the 1's in the adjacency matrix with other roots of unity, one can convert the
(difficult) problem of permanent calculation into a (much easier) determinant calculation.
The rich algebraic structure of determinants has rendered tractable many problems that
appear difficult for more general families of random surfaces.
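To make the determinant idea concrete, here is a minimal Python sketch (an illustration, not the computation used in the cited papers) that counts domino tilings of an $m \times n$ board. It assumes one standard choice of weights, namely 1 on horizontal edges and $i$ on vertical edges of the black-to-white adjacency matrix, and takes the modulus of the determinant; the helper names `det` and `count_tilings` are hypothetical.

```python
def det(matrix):
    """Determinant of a square complex matrix by Gaussian elimination with pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1 + 0j
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            return 0j                      # singular: no perfect matching
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result               # a row swap flips the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return result

def count_tilings(m, n):
    """Number of domino tilings of an m x n board of unit squares."""
    cells = [(i, j) for i in range(m) for j in range(n)]
    black = [c for c in cells if (c[0] + c[1]) % 2 == 0]
    white = [c for c in cells if (c[0] + c[1]) % 2 == 1]
    if len(black) != len(white):
        return 0                           # odd area: no tilings
    col = {w: k for k, w in enumerate(white)}
    size = len(black)
    K = [[0j] * size for _ in range(size)]
    for r, (i, j) in enumerate(black):
        # weight 1 for horizontal neighbours, weight i for vertical neighbours
        for di, dj, weight in ((0, 1, 1), (0, -1, 1), (1, 0, 1j), (-1, 0, 1j)):
            w = (i + di, j + dj)
            if w in col:
                K[r][col[w]] = weight
    return round(abs(det(K)))
```

With this weighting, the two horizontal and two vertical edges around each unit face have alternating product $-1$, which is the sign condition that makes every term of the determinant carry the same phase.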

In one recent paper [TH], Cohn, Kenyon, and Propp proved the following: Suppose $R_n$ is
a sequence of domino-tilable regions such that the boundary of $\frac{1}{n} R_n$ converges to that of a
simple region $R$ in $\mathbb{R}^2$. Let $\mu_n$ be the uniform measure on the set of tilings of $R_n$. Each $\mu_n$
also induces a measure on the set of possible height functions $\phi_n$ on the vertices of $R_n$; the
values of $\phi_n$ on the boundary of $R_n$ are determined by the shape of $R_n$ independently of the
tiling. Suppose further that the boundary height functions $\frac{1}{n}\phi_n(nx)$ converge (in a certain
sense) to a function $f_0$ defined on the boundary of $R$. Then, [TH] shows that as $n$ gets large,
the normalized height functions of tilings chosen from the $\mu_n$ approach the unique Lipschitz
(with respect to an appropriate norm) function $f$ that agrees with $f_0$ on the boundary of $R$
and minimizes a surface tension integral

$$I(f) = \int_R \sigma(\nabla f(x))\,dx.$$

In fact, their results imply that this surface tension integral is a rate function for a large
deviations principle (under the supremum topology) that holds for a sequence of random surface
measures $\nu_n$ (derived from the $\mu_n$ by standard interpolations) on the space of Lipschitz
functions on $R$ that agree with $f_0$ on the boundary.



These authors also explicitly describe an ergodic gradient Gibbs measure $\mu_u$ of each
slope $u = (u_1, u_2)$ inside the set $U_\Phi = \{u : |u_1| + |u_2| < \frac{1}{2}\}$; they show that only zero-entropy
ergodic gradient Gibbs measures exist with slopes on the boundary of $U_\Phi$, and no Gibbs
measures exist with slopes outside the closure of $U_\Phi$. Since every tiling determines a height
function up to an additive constant, a gradient Gibbs measure in this context is equivalent
to a Gibbs measure on tilings, where in both cases we can take $\mathcal{L} = 2\mathbb{Z}^2$. They conjecture
that for each $u \in U_\Phi$, $\mu_u$ is the only gradient phase of slope $u$. A similar conjecture appears
in an earlier paper by Cohn, Elkies, and Propp [18].

We will resolve this conjecture in Chapter El We also resolve another of their conjectures 
(concerning the local probability densities of domino configurations in large random tilings) 
as a consequence of our large deviations principle on profiles in Chapter 7. We extend these
results, as well as the large deviations principle on random surface shapes produced in [TUj,
to more general families of simply attractive random surfaces.

Using Kasteleyn determinants, the authors of ^F3\ were able to compute the surface tension
$\sigma$ and the ergodic Gibbs measures $\mu_u$ exactly in terms of special functions, and their large
deviations results rely on these exact computations. See [S3] for a generalization of these
computations to perfect matchings of other planar, doubly periodic graphs by the author
and two co-authors. These authors use algebraic geometric constructions called amoebae to
"exactly solve" the dimer model on general weighted doubly periodic lattices by explicitly
computing $\sigma$ and the local probabilities in all of the $\mu_u$. The characterization of ergodic
Gibbs measures on perfect matchings given in [S3] makes use of a few results from Chapters
8 and 9 of this text, including the uniqueness of a measure $\mu_u$ of a given slope. While we
prove for a much more general class of two-dimensional models that all smooth phases have
slopes in the dual lattice $\mathcal{L}'$ of the lattice $\mathcal{L}$ of translation invariance, the authors in [U3] use
the exact solvability to determine precisely which of these slopes admit smooth phases. The
smooth phases in this context are in correspondence with cusps of the surface tension, and
depending on the way the edges of the doubly periodic planar graph are weighted, some, all,
or none of the Gibbs measures $\mu_u$ corresponding to $u \in \mathcal{L}'$ will actually be smooth.

Currently, it seems unlikely that the techniques of [U3] can be extended to exactly solve
more general random surface models, particularly those in dimensions higher than two; but
we will prove enough qualitative results (such as the strict convexity of $\sigma$ and the gradient
Gibbs measure classification) to show that the large deviations theorems apply in general.

1.3.2 Ginzburg-Landau $\nabla\phi$-interface models

Recent papers by Funaki and Spohn [40] and Deuschel, Giacomin, and Ioffe derive
similar results for a continuous generalization of the harmonic crystal called the Ginzburg-Landau
$\nabla\phi$-interface model. These models use ISAPs in which $E = \mathbb{R}$ and $V_{x,y} = V$ for
all adjacent $x, y$. Here $V : \mathbb{R} \to \mathbb{R}$ is convex, symmetric, and $C^2$, with second derivatives
bounded above and below by positive constants. Such potentials $V$ are bounded above and
below by quadratic functions, and we may think of them as "approximately quadratic"
generalizations of the (Gaussian) harmonic crystal, for which $V(\eta) = \beta\eta^2$.

Calculations for these models typically make use of the fact that Gibbs measures in these
models are stationary distributions of infinite-dimensional elliptic stochastic differential
equations (see, e.g., [73], for descriptions and more references). Given a configuration $\phi$, the
"force" on any given "particle" (and hence the stochastic drift of that particle's position) is
within a constant factor of what it would be if the potential were Gaussian; and the rate
at which a pair of Gibbs measures converges in certain couplings is also within a constant factor
of what it would be in a Gaussian model. Although the calculations in, for example, [HI]
or j2S], are still rather complicated, they appear to be simpler than they would be for
general simply attractive models. These authors also derive static Gibbs measure results
as corollaries of more general dynamic results. For example, Funaki and Spohn prove the
uniqueness of gradient phases of a given slope $u$ using a dynamic coupling [JU]. Although
the Gibbs measure classifications and surface shape large deviations principles are proved in
these papers, our large deviations principle for profiles and our variational principle are new
results for $\nabla\phi$-interface models. Also, as mentioned earlier, we derive our large deviations
principles with respect to stronger topologies than [3U] and [25].

See, e.g., Giacomin's survey papers ([IB], jUj, [IS]) for many more references about
Ginzburg-Landau $\nabla\phi$-interface models, including wetting transitions, entropic repulsion,
roughening transitions, etc.

Chapter 2

Specific free energy and variational 
principle 



The notion of specific free energy lies at the heart of all of our main results. Although
definitions and applications of specific free energy are well known for certain families of
shift-invariant measures on $(\Omega, \mathcal{F})$ (see, e.g., Chapters 14 and 15 of [43]), we need to check
that these notions also make sense for our $\mathcal{L}$-invariant gradient measures on $(\Omega, \mathcal{F}^\tau)$. In
this chapter, we provide a definition of the specific free energy of an $\mathcal{L}$-invariant gradient
measure and prove some straightforward consequences, including the first (and easier) half
of the variational principle. We will cite lemmas directly from reference texts [33] and [22]
whenever possible. First, we review some standard facts about relative entropy and free
energy.

2.1 Relative entropy review 

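Throughout this section, we let $(X, \mathcal{X})$ be any Polish space (i.e., a complete, separable metric
space endowed with the metric topology and the Borel $\sigma$-algebra $\mathcal{X}$), $\mu$ and $\nu$ any probability
measures on $(X, \mathcal{X})$, and $\mathcal{A}$ a sub-$\sigma$-algebra of $\mathcal{X}$. Write $\mu \ll \nu$ if $\mu$ is absolutely continuous
with respect to $\nu$. The relative entropy of $\mu$ with respect to $\nu$ on $\mathcal{A}$, denoted $\mathcal{H}_{\mathcal{A}}(\mu, \nu)$, is
defined as follows:

$$\mathcal{H}_{\mathcal{A}}(\mu, \nu) = \begin{cases} \nu(f_{\mathcal{A}} \log f_{\mathcal{A}}) & \mu \ll \nu \text{ on } \mathcal{A} \\ \infty & \text{otherwise,} \end{cases}$$

where $f_{\mathcal{A}}$ is the Radon-Nikodym derivative of $\mu$ with respect to $\nu$ when both measures are
restricted to $\mathcal{A}$. (We often write $\mathcal{H}$ to mean $\mathcal{H}_{\mathcal{X}}$.) Note that this definition still makes sense
if $\nu$ is a finite (positive), non-zero measure (not necessarily a probability measure). If $a > 0$
and $\mu$ is as above with $\mu \ll \nu$ on $\mathcal{A}$, we have:

$$\mathcal{H}_{\mathcal{A}}(\mu, a\nu) = a\nu\left(\tfrac{1}{a} f_{\mathcal{A}} \log\left(\tfrac{1}{a} f_{\mathcal{A}}\right)\right) = \mathcal{H}_{\mathcal{A}}(\mu, \nu) - \log a.$$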



We now cite Proposition 15.5 of [43]:

Lemma 2.1.1 If $\mu$ and $\nu$ are probability measures and $a > 0$, then

1. $\mathcal{H}_{\mathcal{A}}(\mu, \nu) \ge 0$ (and thus $\mathcal{H}_{\mathcal{A}}(\mu, a\nu) \ge -\log a$)

2. $\mathcal{H}_{\mathcal{A}}(\mu, a\nu) = -\log a$ if and only if $\mu = \nu$ on $\mathcal{A}$

3. $\mathcal{H}_{\mathcal{A}}(\mu, a\nu)$ is an increasing function of $\mathcal{A}$

4. $\mathcal{H}_{\mathcal{A}}(\mu, \nu)$ is a convex function of the pair $(\mu, \nu)$ when $\mu$ ranges over probability measures
and $\nu$ over non-zero finite measures on $(X, \mathcal{X})$
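The four properties can be checked numerically on a finite set, where relative entropy is a finite sum. The following Python sketch is only an illustration under that assumption; the probability vectors, the scaling constant, and the helper names `rel_entropy`, `coarsen`, and `mix` are made up for the example, and a sub-$\sigma$-algebra is represented by the partition that generates it.

```python
import math

def rel_entropy(mu, nu):
    """H(mu, nu) = sum_i mu_i * log(mu_i / nu_i); infinite unless mu << nu."""
    total = 0.0
    for m, v in zip(mu, nu):
        if m > 0:
            if v == 0:
                return math.inf
            total += m * math.log(m / v)
    return total

def coarsen(weights, blocks):
    """Restrict a measure to the sub-sigma-algebra generated by a partition."""
    return [sum(weights[i] for i in block) for block in blocks]

def mix(p, q, t):
    """Convex combination t*p + (1-t)*q of two weight vectors."""
    return [t * x + (1 - t) * y for x, y in zip(p, q)]

mu = [0.5, 0.2, 0.2, 0.1]            # illustrative probability vectors
nu = [0.25, 0.25, 0.25, 0.25]
a = 3.0

h_full = rel_entropy(mu, nu)                          # item 1: nonnegative
h_scaled = rel_entropy(mu, [a * v for v in nu])       # equals h_full - log(a)
partition = [[0, 1], [2, 3]]                          # a coarser sigma-algebra
h_coarse = rel_entropy(coarsen(mu, partition), coarsen(nu, partition))  # item 3

mu2, nu2 = [0.1, 0.1, 0.4, 0.4], [0.4, 0.3, 0.2, 0.1]
t = 0.3
h_mix = rel_entropy(mix(mu, mu2, t), mix(nu, nu2, t))  # item 4: joint convexity
h_bound = t * h_full + (1 - t) * rel_entropy(mu2, nu2)
```

Here `h_coarse <= h_full` reflects monotonicity in $\mathcal{A}$, and `h_mix <= h_bound` reflects convexity in the pair $(\mu, \nu)$.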

The following very important fact about relative entropy (Proposition 15.6 of [43]) will enable
us to approximate relative entropy with respect to a subalgebra $\mathcal{A}$ by the relative entropy
with respect to convergent subalgebras. Throughout this work, we denote by $\mathcal{P}(X, \mathcal{X})$ the
set of probability measures on $(X, \mathcal{X})$.

Lemma 2.1.2 Let $\mu, \nu \in \mathcal{P}(X, \mathcal{X})$ and let $\mathcal{A}_n$ be an increasing sequence of subalgebras of $\mathcal{X}$,
and $\mathcal{A}$ the smallest $\sigma$-algebra containing $\bigcup_{n=1}^{\infty} \mathcal{A}_n$. Then:

$$\lim_{n \to \infty} \mathcal{H}_{\mathcal{A}_n}(\mu, \nu) = \mathcal{H}_{\mathcal{A}}(\mu, \nu) = \sup_n \mathcal{H}_{\mathcal{A}_n}(\mu, \nu).$$

Regular conditional probability distributions do not exist for general probability measures 
on general measure spaces. However, the following lemma states that they do exist for all 
of the spaces and measures that will interest us here. It also enables us to express the 
relative entropy of a measure on a product space $X_1 \times X_2$ as the relative entropy of the first
component plus the expected conditional entropy of the second component given the first.
Here, we let $\eta = (\eta_1, \eta_2)$ denote a generic point in $X$. (See Theorems D.3 and D.13 of [22].)

Lemma 2.1.3 Suppose $X = X_1 \times X_2$, where each $X_i$ is Polish with Borel $\sigma$-algebra $\mathcal{X}_i$.
Let $\mu_1$ and $\nu_1$ denote the projections of $\mu, \nu \in \mathcal{P}(X, \mathcal{X})$ to $X_1$. Then there exist regular
conditional probability distributions $\mu^{\eta_1}(\cdot)$ and $\nu^{\eta_1}(\cdot)$ on $X_2$ corresponding to the projection
map $\pi : X \to X_1$. Moreover, the map

$$\eta_1 \mapsto \mathcal{H}(\mu^{\eta_1}(\cdot), \nu^{\eta_1}(\cdot)) : X_1 \to [0, \infty]$$

is $\mathcal{X}_1$-measurable and

$$\mathcal{H}_{\mathcal{X}}(\mu, \nu) = \mathcal{H}_{\mathcal{X}_1}(\mu_1, \nu_1) + \int \mathcal{H}(\mu^{\eta_1}(\cdot), \nu^{\eta_1}(\cdot))\,\mu_1(d\eta_1).$$
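On a finite product space the decomposition in Lemma 2.1.3, and the product-measure inequality of Lemma 2.1.4 that is derived from it, can be verified directly. The Python sketch below is an illustration only: the joint laws on $\{0,1\} \times \{0,1,2\}$ and all function names are hypothetical, and conditional distributions are computed by normalizing rows.

```python
import math

def rel_entropy(mu, nu):
    """H(mu, nu) = sum_i mu_i * log(mu_i / nu_i); infinite unless mu << nu."""
    total = 0.0
    for m, v in zip(mu, nu):
        if m > 0:
            if v == 0:
                return math.inf
            total += m * math.log(m / v)
    return total

def flatten(joint):
    return [p for row in joint for p in row]

def first_marginal(joint):
    return [sum(row) for row in joint]

def second_marginal(joint):
    return [sum(col) for col in zip(*joint)]

def conditional(joint, i):
    """Law of the second coordinate given that the first coordinate equals i."""
    total = sum(joint[i])
    return [p / total for p in joint[i]]

def product(p, q):
    return [[x * y for y in q] for x in p]

def chain_rule_rhs(mu, nu):
    """H(mu_1, nu_1) plus the mu_1-average of the conditional relative entropies."""
    mu1, nu1 = first_marginal(mu), first_marginal(nu)
    rhs = rel_entropy(mu1, nu1)
    for i, weight in enumerate(mu1):
        if weight > 0:
            rhs += weight * rel_entropy(conditional(mu, i), conditional(nu, i))
    return rhs

# entry [i][j] is the mass of (eta_1, eta_2) = (i, j); numbers are illustrative
mu = [[0.30, 0.10, 0.10],
      [0.05, 0.15, 0.30]]
nu = [[0.10, 0.20, 0.10],
      [0.20, 0.10, 0.30]]

lhs = rel_entropy(flatten(mu), flatten(nu))
rhs = chain_rule_rhs(mu, nu)

# Product reference measure nu_1 x nu_2, as in Lemma 2.1.4.
nu1, nu2 = [0.4, 0.6], [0.5, 0.2, 0.3]
mu1, mu2 = first_marginal(mu), second_marginal(mu)
h_joint = rel_entropy(flatten(mu), flatten(product(nu1, nu2)))
h_split = rel_entropy(mu1, nu1) + rel_entropy(mu2, nu2)
h_product = rel_entropy(flatten(product(mu1, mu2)), flatten(product(nu1, nu2)))
```

Up to rounding, `lhs` and `rhs` agree, `h_joint >= h_split`, and `h_product` equals `h_split`, matching the equality case $\mu = \mu_1 \otimes \mu_2$.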

Lemma 2.1.3 and the following simple corollary are the key observations behind the proof of
the easy half of our variational principle (which states that if a slope $u$ gradient measure has
minimal specific free energy among measures of slope $u$, then it must be a gradient Gibbs
measure).

Lemma 2.1.4 Let $\mu_1$ and $\nu_1$ be probability measures on $X_1$, and $\mu_2$ and $\nu_2$ probability measures
on $X_2$. Suppose that $\mu$ is a probability measure on $X_1 \times X_2$ with marginal distributions
given by $\mu_1$ and $\mu_2$, respectively. Then:

$$\mathcal{H}(\mu, \nu_1 \otimes \nu_2) \ge \mathcal{H}(\mu_1 \otimes \mu_2, \nu_1 \otimes \nu_2) = \mathcal{H}(\mu_1, \nu_1) + \mathcal{H}(\mu_2, \nu_2).$$

If we assume that $\mathcal{H}(\mu_1, \nu_1) < \infty$ and $\mathcal{H}(\mu_2, \nu_2) < \infty$, then equality holds if and only if
$\mu = \mu_1 \otimes \mu_2$.

Proof By the previous lemma applied to $\nu = \nu_1 \otimes \nu_2$, we may assume that $\mathcal{H}(\mu_1, \nu_1) < \infty$,
and it is enough to show that

$$\int \mathcal{H}(\mu^{\eta_1}(\cdot), \nu_2)\,\mu_1(d\eta_1) \ge \mathcal{H}(\mu_2, \nu_2).$$

Since $\int_{X_1} \mu^{\eta_1}(\cdot)\,\mu_1(d\eta_1) = \mu_2$, this follows from Jensen's inequality and the convexity of
$\mathcal{H}(\cdot, \nu_2)$, stated in Lemma 2.1.1. This function is strictly convex on its level sets, hence
the characterization of equality. ∎

Finally, we will also be interested in the convergence of sequences of probability measures.
Denote by $\mathcal{P}(X, \mathcal{X})$ the space of probability measures on $(X, \mathcal{X})$. The weak topology on
$\mathcal{P}(X, \mathcal{X})$ is the smallest topology with respect to which the map $\nu \mapsto \nu(f)$ is continuous for
all bounded continuous functions $f$. The $\tau$-topology is the smallest topology with respect
to which $\nu \mapsto \nu(f)$ is continuous for all bounded $\mathcal{X}$-measurable functions $f$ on $X$. The
reader may check that this is also the smallest topology with respect to which $\nu \mapsto \nu(A)$ is
continuous for every $A \in \mathcal{X}$. In general, the $\tau$-topology is stronger than the weak topology.
The two topologies coincide when $(X, \mathcal{X})$ is discrete. We now cite two lemmas (Lemma 6.2.12
with its proof and Lemma D.8 of [22]):

Lemma 2.1.5 Fix $C \in \mathbb{R}$ and a finite measure $\nu$ on $(X, \mathcal{X})$; then the level set $M_{C,\nu}$ of
probability measures $\mu$ on $(X, \mathcal{X})$ with $\mathcal{H}(\mu, \nu) \le C$ is compact in the $\tau$-topology.

Lemma 2.1.6 The weak topology is metrizable on $\mathcal{P}(X, \mathcal{X})$ and makes $\mathcal{P}(X, \mathcal{X})$ into a Polish
space (i.e., a complete, separable metric space).

From these lemmas, we deduce the following:

Lemma 2.1.7 The $\tau$-topology restricted to a level set $M_{C,\nu}$ is equivalent to the weak topology
and hence also metrizable. In particular, the compactness of the level sets (Lemma 2.1.5)
implies sequential compactness of the level sets in both topologies.

Proof This is a well-known result (stated, for example, in the proof of Theorem 3.2.21 and
Exercise 3.2.23 of [26]), but we sketch the proof here. It is enough to prove that the $\tau$-topology
on $M_{C,\nu}$ is contained in the weak topology on $M_{C,\nu}$. We can prove this by showing
that if a measure $\mu$ lies in a base set $\mathcal{B}$ of the $\tau$-topology, then there exists a base set $\mathcal{B}'$ of
the weak topology with $\mu \in \mathcal{B}' \subset \mathcal{B}$. Precisely, we must show that if $A \in \mathcal{X}$, then for each
$\epsilon > 0$ and measure $\mu$, there exists a bounded, continuous function $f$ and an $\epsilon_0 > 0$ such that
$|\mu(f) - \mu'(f)| < \epsilon_0$ implies $|\mu(A) - \mu'(A)| < \epsilon$ whenever $\mu' \in M_{C,\nu}$. Note that:

$$|\mu(A) - \mu'(A)| \le |\mu(A) - \mu(f)| + |\mu(f) - \mu'(f)| + |\mu'(f) - \mu'(A)|.$$

We would like to show that by choosing $\epsilon_0$ and $f$ appropriately, we can force each of the
right hand terms to be as small as we like. The second term is obvious. For the first term,
it is enough to note that for any $\delta$, we can find a positive continuous function $f$ such that
$\mu|f - 1_A| < \delta$ and $0 \le f \le 1$. (Simply take a closed set $A'$ with $\nu|1_{A'} - 1_A| < \delta$; define
$f(x) = 1$ for $x \in A'$ and $f(x) = \sup(0, 1 - d(A', x)/a)$ otherwise, where $a > 0$ is sufficiently
small and $d(A', x)$ is the distance from $A'$ to $x$.) For the third term, note that by taking $\delta$
and $a$ small enough in the above construction, we can also assume that $f$ and $1_A$ are equal
outside a set of $\nu$-measure at most $\gamma$ for any $\gamma > 0$. Using the definition of relative entropy,
it is not hard to see that for any $\gamma'$ we can find a $\gamma$ small enough so that for each $\mu' \in M_{C,\nu}$
we have $\mu'(B) < \gamma'$ whenever $B \in \mathcal{X}$ is such that $\nu(B) < \gamma$; this puts a bound of $2\gamma'$ on the
third term. ∎

In particular, this lemma implies that the level sets are closed in both topologies, which 
implies the following: 

Corollary 2.1.8 For fixed $\nu$, the function $\mathcal{H}(\cdot, \nu)$ is a lower semicontinuous function on
$\mathcal{P}(X, \mathcal{X})$, endowed with either the weak topology or the $\tau$-topology.

Our motivation for the last few lemmas is that, using these results, we will later define
a topology on $\mathcal{P}(\Omega, \mathcal{F}^\tau)$ (the topology of local convergence) with respect to which "specific
relative entropy" and specific free energy have compact level sets. This will allow us to 
deduce, for example, that the specific free energy achieves its minimum on sets that are 
closed in this topology. And this will lead to proofs of the existence of gradient Gibbs 
measures of particular slopes. 

2.2 Free energy 

Let $\lambda$ be an "underlying" probability measure on $X = \mathbb{R}^m$ (or $\mathbb{Z}^m$) and $V$ an $\mathcal{X}$-measurable
Hamiltonian potential for which $Z = \int_X e^{-V(\eta)}\,\lambda(d\eta)$ is finite. We define the free energy of
any measure $\mu$ on $(X, \mathcal{X})$ as the following relative entropy:

$$FE^V(\mu) = \mathcal{H}(\mu, e^{-V}\lambda),$$

where we use the convention that $f\lambda$ is the measure whose Radon-Nikodym derivative with
respect to $\lambda$ is $f$. We will write $FE^V(\mu) = FE(\mu)$ when the choice of potential is clear from
the context. We can also write this expression as $\mu(V + \log f)$, where $f$ is the Radon-Nikodym
derivative of $\mu$ with respect to $\lambda$. When they exist, we refer to $\mu(V)$ as the energy of $\mu$ and
to $-\mu(\log f)$ as the entropy of $\mu$ (which is $-\infty$ if $\mu$ is not absolutely continuous with respect
to Lebesgue measure). The probability measure $\mu_V = Z^{-1} e^{-V}\lambda$ is called the Gibbs measure
for the Hamiltonian $V$. The free energy of the Gibbs measure is simply $-\log Z$, and the free
energy of a general measure $\mu$ can also be written $\mathcal{H}(\mu, \mu_V) - \log Z$. A trivial consequence of
this fact and Lemma 2.1.1 is the following so-called finite dimensional variational principle:

Lemma 2.2.1 The free energy of any probability measure $\mu$ on $X$ is equal to or greater than
that of $\mu_V$; equality holds if and only if $\mu = \mu_V$. In other words, a measure is Gibbs if and
only if it has minimal free energy, equal to $-\log Z$.
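The finite dimensional variational principle is easy to see numerically on a finite state space. The following Python sketch is an illustration only, assuming $\lambda$ is counting measure on four states with a made-up potential `V`; the function names are hypothetical.

```python
import math

def rel_entropy(mu, nu):
    """H(mu, nu) = sum_i mu_i * log(mu_i / nu_i) for strictly positive nu."""
    return sum(m * math.log(m / v) for m, v in zip(mu, nu) if m > 0)

def free_energy(mu, V, lam):
    """FE(mu) = mu(V + log f), where f is the density of mu with respect to lam."""
    return sum(m * (v + math.log(m / l)) for m, v, l in zip(mu, V, lam) if m > 0)

V = [0.0, 1.0, 4.0, 9.0]        # illustrative Hamiltonian on four states
lam = [1.0, 1.0, 1.0, 1.0]      # counting measure as the underlying measure

Z = sum(l * math.exp(-v) for v, l in zip(V, lam))
gibbs = [l * math.exp(-v) / Z for v, l in zip(V, lam)]

fe_gibbs = free_energy(gibbs, V, lam)          # should equal -log Z
other = [0.25, 0.25, 0.25, 0.25]               # some competing probability measure
fe_other = free_energy(other, V, lam)
gap = rel_entropy(other, gibbs)                # FE(other) - FE(gibbs)
```

The quantity `gap` is the relative entropy $\mathcal{H}(\mu, \mu_V)$, so it is nonnegative and vanishes only at the Gibbs measure.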

The following monotonicity is also an easy consequence of the definitions. 

Lemma 2.2.2 If $V_1(\eta) \ge V_2(\eta)$ for all $\eta$, then $FE^{V_1}(\mu) \ge FE^{V_2}(\mu)$ for all $\mu$.

It will sometimes be useful to know that an upper bound on the free energy of $\mu$ allows us
to put a lower bound on the amount of mass of $\mu$ that lies inside a particular compact
set.

Lemma 2.2.3 If $X$ is $\mathbb{Z}^m$ or $\mathbb{R}^m$ and $Z = \int_X e^{-V(\eta)}\,d\eta$ is finite, then for every $c > 0$ and
every $d$, there exists a compact set $S \subset X$ such that $\mu(S) < 1 - c$ implies $FE(\mu) \ge d$
whenever $\mu$ is a probability measure on $X$.

Proof From Lemma 2.2.1 we know that if we require $\mu(X - S) = 1$, then the minimal free
energy $\mu$ can have is given by

$$-\log\left(\int_{X - S} e^{-V(\eta)}\,\lambda(d\eta)\right).$$

Similarly, if $\mu(X - S) = a$, the minimal free energy is at least

$$a \log(a) + (1 - a) \log(1 - a) - a \log \int_{X - S} e^{-V(\eta)}\,d\eta - (1 - a) \log \int_S e^{-V(\eta)}\,d\eta.$$

(See Lemma 2.1.3.) If we choose $S$ to be a large enough closed ball containing a given point
such that $\int_{X - S} e^{-V(\eta)}\,d\eta < 1/n$, we can make the latter expression at least

$$-\log 2 + a \log n - (1 - a) \log(Z - 1/n),$$

which tends to $\infty$ with $n$. ∎

Also, because free energy can be defined as a relative entropy, each of the lemmas proved in
the previous section applies to free energy as well.

2.3 Specific free energy: existence via superadditivity



We now return to our infinite dimensional setting. That is, $(\Omega, \mathcal{F}^\tau)$ is the set of functions
from $\mathbb{Z}^d$ to $E$ endowed with the $\sigma$-algebra $\mathcal{F}^\tau$ described in the introduction, and $\Phi$ is an
$\mathcal{L} \times \tau$-invariant gradient potential. In this section, we use limits to give a definition of specific free
energy for $\mathcal{L}$-invariant gradient measures on $(\Omega, \mathcal{F}^\tau)$. We prove the existence of these limits
using superadditivity arguments.

Let $\mu_\Lambda$ be the measure on $E^\Lambda$ obtained by restricting $\mu$ to $\mathcal{F}_\Lambda$.

Let $\Lambda_n$ denote the box $[0, kn - 1]^d \subset \mathbb{Z}^d$, where $k$ is chosen so that $k\mathbb{Z}^d \subset \mathcal{L}$. When
$\lambda$ is a finite measure on $E$, a standard definition of the specific free energy of an ordinary
(not gradient) shift-invariant measure on $(\Omega, \mathcal{F})$ is the following limit of normalized relative
entropies:

$$\lim_{n \to \infty} |\Lambda_n|^{-1}\, \mathcal{H}\left(\mu_{\Lambda_n}, e^{-H^o_{\Lambda_n}} \lambda^{|\Lambda_n|}\right).$$

We will use a similar approach in our gradient setting except that we will only compute
relative entropies with respect to the subalgebra of gradient measurable sets. That is, if $\mu$
is an $\mathcal{L}$-invariant measure on $\Omega$, we write:

$$SFE(\mu) = \lim_{n \to \infty} |\Lambda_n|^{-1}\, \mathcal{H}\left(\mu_{\Lambda_n}, e^{-H^o_{\Lambda_n}} \lambda^{|\Lambda_n| - 1}\right),$$

which we interpret as follows: Fix a reference vertex $v_0 \in \Lambda_n$ and let $v_1, \ldots, v_{|\Lambda_n| - 1}$ be an
enumeration of the remaining vertices. In this context, by $\lambda^{|\Lambda_n| - 1}$ we mean the measure $\nu$
on $(\Omega, \mathcal{F}^\tau_{\Lambda_n})$ such that for any measurable $A \subset E^{|\Lambda_n| - 1}$, the value

$$\nu\left(\{\phi \mid (\phi(v_1) - \phi(v_0), \phi(v_2) - \phi(v_0), \ldots, \phi(v_{|\Lambda_n| - 1}) - \phi(v_0)) \in A\}\right)$$

is equal to the measure of $A$ in the product measure $\lambda^{|\Lambda_n| - 1}$. (The reader may check that
this definition is independent of the choice of $v_0$.)

Also, when $\mu$ is a gradient measure (only defined on $\mathcal{F}^\tau$), then we write $\mu_\Lambda$ to mean
the restriction of $\mu$ to $\mathcal{F}^\tau_\Lambda$. The latter is also the smallest $\sigma$-algebra with respect to which
$\phi(v) - \phi(v_0)$ is measurable for each $v \in \Lambda$, so we can think of $\mu_\Lambda$ as a measure on this
$(|\Lambda| - 1)$-dimensional space, $E^{|\Lambda| - 1}$. In this context, the expression
$\mathcal{H}(\mu_{\Lambda_n}, e^{-H^o_{\Lambda_n}} \lambda^{|\Lambda_n| - 1})$ makes sense.

As a convenient shorthand, we also write $FE^\Phi_\Lambda(\mu) = \mathcal{H}(\mu_\Lambda, e^{-H^o_\Lambda} \lambda^{|\Lambda| - 1})$ and refer to this
as the free energy of $\mu$ restricted to $\Lambda$. (We occasionally drop the $\Phi$ from $FE^\Phi_\Lambda$ when the
choice of potential is understood.) Let $Z_\Lambda$ be the integral of $e^{-H^o_\Lambda}$ over the entire space $E^{|\Lambda| - 1}$
of gradient functions, as described above, and refer to this as the free boundary partition
function of $\Lambda$ with respect to $\Phi$. It is clear that (at least for perturbed simply attractive
models) this value is always finite.

Moreover, from Lemma 2.2.1 it follows that $FE^\Phi_\Lambda(\mu) \ge -\log Z_\Lambda$ for all $\mu$. When the
choice of $\Phi$ is understood, we write $W(\Lambda) = -\log Z_\Lambda$ for this minimal free energy. We say
that a potential $\Phi$ is positive if $\Phi_\Lambda(\phi) \ge 0$ for all $\Lambda$ and all $\phi$.

Lemma 2.3.1 Suppose that $\Phi$ is a positive potential and that $\Lambda_1$ and $\Lambda_2$ are finite connected
subsets of $\mathbb{Z}^d$ (of finite weight with respect to $\Phi$) that have exactly one vertex $w$ in common.
Then $W(\Lambda_1 \cup \Lambda_2) \ge W(\Lambda_1) + W(\Lambda_2)$. Furthermore, for every measure $\mu$ on $(\Omega, \mathcal{F}^\tau_{\Lambda_1 \cup \Lambda_2})$ we
have:

$$FE_{\Lambda_1 \cup \Lambda_2}(\mu) \ge FE_{\Lambda_1}(\mu) + FE_{\Lambda_2}(\mu).$$

Proof We take $w$ to be the reference vertex $v_0$ for $\Lambda_1$, $\Lambda_2$, and $\Lambda_1 \cup \Lambda_2$. Write $X = E^{|\Lambda_1| - 1}$
and $Y = E^{|\Lambda_2| - 1}$ and view $E^{|\Lambda_1 \cup \Lambda_2| - 1}$ as $X \times Y$. Then $H^o_{\Lambda_1 \cup \Lambda_2}$ is equal to $H^o_{\Lambda_1} + H^o_{\Lambda_2}$ + the
sum of $\Phi_{\Lambda'}$ over all finite subsets $\Lambda'$ of $\mathbb{Z}^d$ such that $\Lambda' \subset \Lambda_1 \cup \Lambda_2$ but $\Lambda' \not\subset \Lambda_1$ and $\Lambda' \not\subset \Lambda_2$.
Since the latter sum is positive, we have

$$H^o_{\Lambda_1 \cup \Lambda_2} \ge H^o_{\Lambda_1} + H^o_{\Lambda_2},$$

and the lemma is now an immediate consequence of Lemmas 2.1.4 and 2.2.2. ∎

In some cases, this gives us a useful lower bound on the free energy in a set $\Lambda$.

Corollary 2.3.2 Suppose that $W(e) \ge a$ whenever $e$ is an edge (a pair of adjacent vertices) in $\mathbb{Z}^d$. Then
for any $\Lambda$, $W(\Lambda) \ge (|\Lambda| - 1)a$. In particular, $FE^\Phi_\Lambda(\mu) \ge (|\Lambda| - 1)a$ for any finite, connected
subset $\Lambda$ of $\mathbb{Z}^d$.

Proof Apply Lemma 2.3.1 to the $|\Lambda| - 1$ edges in any spanning tree of $\Lambda$. ∎

We denote by $\mathcal{D}_{\mathcal{L}}$ the space of positive $\mathcal{L} \times \tau$-invariant potentials $\Phi$ for which $W(e)$
is finite for every edge $e$. This is a convenient class of potentials in which to state a few
lemmas; most importantly, for the purposes of this paper, every $\mathcal{L}$-invariant perturbed SAP
is in $\mathcal{D}_{\mathcal{L}}$. Since $\Phi$ is $\mathcal{L}$-invariant, $W(e)$ is also $\mathcal{L}$-invariant and hence assumes only finitely
many values on edges $e$ in $\mathbb{Z}^d$. Thus, the above corollary applies to all potentials in $\mathcal{D}_{\mathcal{L}}$,
which leads us to another corollary. Let $e(\Lambda)$ be the number of edges (i.e., pairs of adjacent
vertices) in the set $\Lambda$ and let $a' = \inf\{a, 0\}$, where $a$ is as defined above.

Corollary 2.3.3 Fix $\Phi$ in $\mathcal{D}_{\mathcal{L}}$ and suppose $\mu$ is an $\mathcal{L}$-invariant measure on $(\Omega, \mathcal{F}^\tau)$. Then
the value

$$FE'_\Lambda(\mu) = FE^\Phi_\Lambda(\mu) - e(\Lambda)a'$$

is superadditive in the sense that if $\Lambda_1$ and $\Lambda_2$ are disjoint but at least one vertex of $\Lambda_1$ is
adjacent to a vertex of $\Lambda_2$, then $FE'_{\Lambda_1 \cup \Lambda_2}(\mu) \ge FE'_{\Lambda_1}(\mu) + FE'_{\Lambda_2}(\mu)$ for any measure $\mu$.

Proof Let $e = (x, y)$ be the edge with $x \in \Lambda_1$ and $y \in \Lambda_2$ and apply Lemma 2.3.1 twice,
first to the pair $\Lambda_1$ and $\{x, y\}$, and then to the pair $\Lambda_1 \cup \{x, y\}$ and $\Lambda_2$. Then the above
follows from the fact that $FE^\Phi_e(\mu) \ge a'$ and $e(\Lambda_1 \cup \Lambda_2) \ge e(\Lambda_1) + e(\Lambda_2) + 1$. ∎

Since $FE'_e(\mu)$ is positive for any edge, it is positive for any connected set $\Lambda$. By Corollary
2.3.3 this implies that $FE'_\Lambda$ is increasing as a function of the connected set $\Lambda$: that is,
$FE'_\Lambda(\mu) \ge FE'_{\Lambda'}(\mu)$ whenever $\Lambda' \subset \Lambda$ and both $\Lambda$ and $\Lambda'$ are connected. Of course, the
$FE'$ defined above is also invariant under $\mathcal{L}$. It is not hard to see that there exists an $N$,
depending on $\mathcal{L}$, such that if $w$ is any vector of integers in $\mathbb{Z}^d$, then $Nw \in \mathcal{L}$. Now consider,
as a function of $w$, the value $FE'_{\Lambda_w}(\mu)$, where $\Lambda_w$ is the set of integer vectors $a \in \mathbb{Z}^d$ with
$0 \le a_i < w_i N$ for $1 \le i \le d$. The above corollary implies that if $c_i = a_i + b_i$ for some $i$ and all
other coordinates of $a$, $b$, and $c$ are equal, then $FE'_{\Lambda_c}(\mu) \ge FE'_{\Lambda_a}(\mu) + FE'_{\Lambda_b}(\mu)$ for every
$\mathcal{L}$-invariant $\mu$. This is the usual definition of superadditivity for functions on $\mathbb{Z}^d$, and the
following lemma follows by standard methods (see, e.g., Lemma 15.11 of [IB]):

Lemma 2.3.4 If $\Phi \in \mathcal{D}_{\mathcal{L}}$, then the value $SFE'_{\Lambda_w}(\mu) = |\Lambda_w|^{-1} FE'_{\Lambda_w}(\mu)$, with $FE'$ as defined
above, tends to a unique limit $SFE'(\mu)$ in $[0, \infty]$ as the coordinates of $w$ tend to $\infty$.

Now, we write $SFE(\mu) = SFE'(\mu) + da'$ and $SFE_\Lambda(\mu) = |\Lambda|^{-1} FE_\Lambda(\mu)$. It is not hard
to see that $SFE_{\Lambda_w}(\mu)$ converges to $SFE(\mu)$ as $w$ tends to $\infty$. We refer to $SFE(\mu)$ as the
specific free energy of $\mu$.

Note that the limits used in the definition of the specific free energy assume that we
have chosen a specific lattice $\mathcal{L}$: we might write $SFE^{\mathcal{L}}(\mu)$ to denote the specific free energy
with respect to the lattice $\mathcal{L}$. However, it is clear that if $\mu$ is $\mathcal{L}$-invariant, then when $\mathcal{L}$ is
replaced by a full rank sublattice $\mathcal{L}'$, the limits are not changed, so $SFE^{\mathcal{L}}(\mu) = SFE^{\mathcal{L}'}(\mu)$.
Similarly, if $\mu$ is invariant with respect to any two full rank lattices $\mathcal{L}$ and $\mathcal{L}'$, then we have
$SFE^{\mathcal{L}}(\mu) = SFE^{\mathcal{L} \cap \mathcal{L}'}(\mu) = SFE^{\mathcal{L}'}(\mu)$. Next, using the $\mathcal{L}$-invariance of $\mu$ and the fact that
$FE'_\Lambda(\mu)$ is increasing in $\Lambda$, we have $FE'_{\Lambda_w}(\mu) \le FE'_{\Lambda_{w+v}}(\theta_x \mu) \le FE'_{\Lambda_{w+2v}}(\mu)$, where $v$
is the vector with all of its coordinates equal to 1. Taking limits and using Lemma 2.3.4,
it follows that $SFE^{\mathcal{L}}(\mu) = SFE^{\mathcal{L}}(\theta_x \mu)$, where $\theta_x$ is any translation of $\mathbb{Z}^d$ (and $x$ is not
necessarily in $\mathcal{L}$). We state these facts as a lemma:

Lemma 2.3.5 If $\mathcal{L}$ and $\mathcal{L}'$ are two full-rank sublattices of $\mathbb{Z}^d$ and $\mu$ is both $\mathcal{L}$-invariant and
$\mathcal{L}'$-invariant, then $SFE^{\mathcal{L}}(\mu) = SFE^{\mathcal{L}'}(\mu)$. Moreover, $SFE^{\mathcal{L}}(\mu) = SFE^{\mathcal{L}}(\theta_x(\mu))$ for any
$x \in \mathbb{Z}^d$ and any $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$.

We have defined $SFE^{\mathcal{L}}(\mu)$ for all $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$; it is often convenient to extend the
definition to all of $\mathcal{P}(\Omega, \mathcal{F}^\tau)$ by writing $SFE^{\mathcal{L}}(\mu) = \infty$ whenever $\mu \notin \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$. We will
generally write $SFE(\mu)$ for $SFE^{\mathcal{L}}(\mu)$, assuming the choice of lattice to be clear from the
context.

Next, it will often be useful to us to have lower bounds on the specific free energy of $\mu$
in terms of the free energies $FE_\Lambda(\mu)$. Since $a' \le 0$, we have the following bound for any $w$:

Lemma 2.3.6 $SFE(\mu) - da' \ge SFE'(\mu) \ge SFE'_{\Lambda_w}(\mu) \ge SFE_{\Lambda_w}(\mu) = |\Lambda_w|^{-1} FE_{\Lambda_w}(\mu)$

We can derive a similar bound involving $FE_\Lambda$ for a non-rectangular set $\Lambda$. Let $\Lambda$ be a
connected subset of $\Lambda_w$. Then, using repeated applications of Lemma 2.3.1, we can show
that $FE_{\Lambda_w}(\mu) \ge FE_\Lambda(\mu) + a(|\Lambda_w| - |\Lambda|)$. Thus, we have

$$SFE(\mu) \ge |\Lambda_w|^{-1}\left(FE_\Lambda(\mu) + a(|\Lambda_w| - |\Lambda|)\right).$$

In particular, we can say the following:

Lemma 2.3.7 For each $\Lambda \subset \mathbb{Z}^d$, there exist constants $C_1 > 0$ and $C_2$ such that $SFE(\mu) \ge
C_1 FE_\Lambda(\mu) + C_2$.

We can use this fact to check one more important result:

Lemma 2.3.8 For every constant $C \in \mathbb{R}$, there exist:

1. A $C_1$ such that $SFE(\mu) \le C$ implies $\mu(V_{x,y}(\phi(y) - \phi(x))) \le C_1$ for any adjacent pair
$(x, y)$ in $\mathbb{Z}^d$. (In fact, we can write $C_1 = aC + b$ for some constants $a$ and $b$.)

2. A $C_2$ such that $SFE(\mu) \le C$ implies $|S(\mu)| \le C_2$

Proof First, suppose $x$ and $y$ are fixed. For an appropriate $C'$, we can use Lemma 2.3.7 to show that $SFE(\mu) \le C$ implies $FE_{\{x,y\}}(\mu) = \mathcal{H}(\mu_{\{x,y\}}, e^{-V_{x,y}} \lambda) \le C'$. Writing $\eta = \phi(y) - \phi(x)$ and letting $f$ be the Radon-Nikodym derivative of $\mu_{\{x,y\}}$ with respect to $\lambda$, we can write the latter expression as

$$\int f(\eta) V_{x,y}(\eta)\, d\eta + \int f(\eta) \log f(\eta)\, d\eta \le C'.$$

(Here $d\eta$ is understood to mean $d\lambda(\eta)$.) By Lemma 2.1.1 there is a $\beta$ such that

$$\frac{1}{2} \int f(\eta) V_{x,y}(\eta)\, d\eta + \int f(\eta) \log f(\eta)\, d\eta \ge \beta$$

for all probability densities $f$. Taking the difference of the leftmost terms in the preceding two equations, we conclude that $\int f(\eta) V_{x,y}(\eta)\, d\eta \le 2(C' - \beta)$. Finally, we can compute this last expression for an edge $(x, y)$ in each of the (finitely many) equivalence classes of edges modulo $\mathcal{L}$ and let $C_1$ be the supremum of these values.

Next, if $SFE(\mu) \le C$, then $\mu$ must be $\mathcal{L}$-invariant; thus, to derive a uniform bound on $S(\mu)$, it is enough to derive a uniform bound on $|\mu(\phi(y) - \phi(x))| \le \mu(|\phi(y) - \phi(x)|)$ for each adjacent pair $(x, y)$ in $\mathbb{Z}^d$. Since $V_{x,y}(\eta)$ increases at least linearly in $|\eta|$, the existence of a uniform bound on $\mu(|\phi(y) - \phi(x)|)$ follows immediately from the first part of this lemma. ∎

2.4 Specific free energy level set compactness

Now that we have defined the specific free energy for measures in $\mathcal{P}(\Omega, \mathcal{F}^\tau)$, we can begin to discuss its properties. One of the most important concerns the level sets $M_C = \{\mu \mid SFE(\mu) \le C\}$, as subsets of $\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$. Define the topology of local convergence on $\mathcal{P}(\Omega, \mathcal{F}^\tau)$ to be the smallest topology in which the maps $\mu \mapsto \mu(f)$ are continuous for every bounded, gradient cylinder function $f$ (i.e., every bounded function that is $\mathcal{F}^\tau_\Lambda$-measurable for some $\Lambda \subset\subset \mathbb{Z}^d$) from $\Omega$ to $\mathbb{R}$.

Lemma 2.4.1 Each level set $M_C$ of $\mathcal{P}(\Omega, \mathcal{F}^\tau)$, endowed with the restriction of the topology of local convergence to that set, is a metric space (i.e., the topology of local convergence restricted to $M_C$ can be induced by an appropriate metric).

Proof Let $\{\Lambda_j\}$ be an enumeration of the connected finite subsets of $\mathbb{Z}^d$. Let $\delta_j(\mu, \nu)$ be the distance between the restrictions $\mu_{\Lambda_j}$ and $\nu_{\Lambda_j}$ in the metric for the weak topology on $\mathcal{P}(\Omega, \mathcal{F}^\tau_{\Lambda_j})$. Then $\delta(\mu, \nu) = \sum_{j=1}^{\infty} 2^{-j} \inf(1, \delta_j(\mu, \nu))$ is a metric for the topology of local convergence on $M_C$. It is clear that $\mu^i$ converges to $\mu$ in this topology if and only if $\mu^i_{\Lambda_j}$ converges weakly to $\mu_{\Lambda_j}$ for every $j$. ∎
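
As a concrete illustration of the construction in this proof, the following Python sketch combines per-window distances into a single truncated, geometrically weighted sum. It is only a toy under stated assumptions: each measure is represented by a finite list of probability vectors (one per window, standing in for the restrictions to the sets $\Lambda_j$), the infinite sum is cut off at the number of windows supplied, and total variation distance is used as a hypothetical stand-in for a metric on each restriction. The names `total_variation` and `local_metric` are illustrative and do not come from the text.

```python
def total_variation(p, q):
    """Total variation distance between two probability vectors of equal length."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def local_metric(mu, nu, window_distance=total_variation):
    """Truncated version of delta(mu, nu) = sum_j 2^{-j} min(1, delta_j(mu, nu)).

    mu and nu are lists whose j-th entries play the role of the restrictions
    of the two measures to the j-th window (an assumption of this toy model).
    """
    total = 0.0
    for j, (p, q) in enumerate(zip(mu, nu), start=1):
        # Truncating each term at 1 keeps the weighted sum bounded by 1.
        total += 2.0 ** (-j) * min(1.0, window_distance(p, q))
    return total


# Two toy "measures" that differ only on the first window.
mu = [[0.5, 0.5], [0.25, 0.25, 0.5]]
nu = [[1.0, 0.0], [0.25, 0.25, 0.5]]
print(local_metric(mu, nu))
```

The truncation at 1 and the weights $2^{-j}$ are what make the sum converge regardless of how large the individual distances are; a sequence converges in the combined distance exactly when it converges in every window.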

We next prove that $M_C$ is also compact (and hence Polish):

Theorem 2.4.2 The level sets $M_C$ are closed and sequentially compact in the topology of local convergence on $\mathcal{P}(\Omega, \mathcal{F}^\tau)$. Being metrizable, they are thus compact, and hence Polish (i.e., complete and separable) metric spaces for this topology.

Proof First of all, if we are given any sequence $\{\mu^i\}$ of measures in $M_C$, then by Lemma 2.3.7, Lemma 2.3.1, and Lemma 2.1.5 we can, for any fixed $\Lambda_j$, find a subsequence of $\mu^i$ on which the restrictions $\mu^i_{\Lambda_j}$ converge in the $\tau$-topology to a fixed probability measure $\nu_{\Lambda_j}$ on $(\Omega, \mathcal{F}^\tau_{\Lambda_j})$. By a standard diagonalization argument, we can take a subsequence such that $\mu^i_{\Lambda_j}$ converges in the $\tau$-topology for each of the countably many sets $\Lambda_j$ to some probability measure $\nu_{\Lambda_j}$.

Kolmogorov's extension theorem for the Polish space $(\Omega, \mathcal{F}^\tau)$ then implies that there exists a measure $\nu \in \mathcal{P}(\Omega, \mathcal{F}^\tau)$ whose restrictions to the $\mathcal{F}^\tau_{\Lambda_j}$ are in fact these measures $\nu_{\Lambda_j}$. It is clear that this $\nu$ is a limit point of the $\mu^i$ in the topology of local convergence; every $\Lambda \subset\subset \mathbb{Z}^d$ is contained in some $\Lambda_j$, and hence every cylinder set $A \in \mathcal{F}^\tau_\Lambda$ is also contained in some $\mathcal{F}^\tau_{\Lambda_j}$ on which $\mu^i_{\Lambda_j}$ converges to $\nu_{\Lambda_j}$. Since the restrictions $\nu_{\Lambda_j}$ are clearly $\mathcal{L}$-invariant, and the sets $\mathcal{F}^\tau_{\Lambda_j}$ generate $\mathcal{F}^\tau$, it follows that $\nu$ is $\mathcal{L}$-invariant. Moreover, we must have $SFE(\nu) \le C$. If this were not the case, then by Lemmas 2.3.6 and 2.3.4 we would have $SFE'_{\Lambda_w}(\nu) > C - d\alpha'$ for some $w$. Now, Lemma 2.1.5 implies that there must be a $\mu^i$ with $SFE'_{\Lambda_j}(\mu^i) > C - d\alpha'$ (for $\Lambda_j = \Lambda_w$). Applying Lemma 2.3.6, this implies that $SFE(\mu^i) > C$, a contradiction. ∎

2.5 Minimizers of specific free energy are Gibbs measures

Lemma 2.5.1 Whenever $ e T)^, there exists an S F E® -minimizing measure in P&(fl, 9 rr ), 
i.e., a measure fi in y Cj (n,9" r ) such that, for any other measure jj, G T^(fi, 9 rr ), we have 
SFE*(n) > SFE*(no)- 

This minimal value is sometimes called the pressure of $\Phi$ and denoted $P(\Phi)$.

Proof Note that $\bigcap_{C > P(\Phi)} M_C$ is an intersection of non-empty, decreasing compact sets; hence, it is nonempty. ∎
The following is the easy half of our variational principle. It is not hard to prove this result in more generality, but for simplicity we will describe only the perturbed simply attractive case.

Theorem 2.5.2 Let $\Phi$ be a perturbed simply attractive potential. If $\mu$ has minimal specific free energy (with respect to $\Phi$) among $\mathcal{L}$-invariant measures with slope $u$, then $\mu$ is a Gibbs measure.

Proof Suppose that $\mu$ is an $\mathcal{L}$-invariant measure with finite specific free energy and slope $u$ that is not a Gibbs measure. We will show that in this case it is always possible to modify $\mu$ to produce a measure $\bar\mu$ with slope equal to $u$ such that $SFE^\Phi(\bar\mu) < SFE^\Phi(\mu)$.

If $\mu$ is not Gibbs, then for some $\Lambda$, we have $\mu\gamma_\Lambda \ne \mu$, and hence $\mathcal{H}(\mu, \mu\gamma_\Lambda) = D > 0$. Since $\Phi$ has finite range, there is an integer $r$ such that $\Phi_\Delta = 0$ whenever $\Delta \subset\subset \mathbb{Z}^d$ contains two vertices of distance $r$ or more apart.

Now, let $\mathcal{L}'$ be a sublattice of $\mathcal{L}$ such that for any non-zero $i \in \mathcal{L}'$, each vertex in $\Lambda$ is at least distance $2r$ from each vertex of $\Lambda + i$. Then we define our modified measure:

$$\bar\mu = \mu \prod_{i \in \mathcal{L}'} \gamma_{\Lambda + i}.$$

Although the composition on the right-hand side is infinite, by choice of $\mathcal{L}'$, the kernels $\gamma_{\Lambda+i}$ commute; hence, the order in which the kernels are applied does not matter. Moreover, the infinite composition converges in the topology of local convergence, since every $\Lambda' \subset\subset \mathbb{Z}^d$ intersects only finitely many of the sets $\Lambda + i$. Informally, it is easy to see why this measure has lower specific free energy than $\mu$: applying $\gamma_{\Lambda+i}$ decreases the free energy contained in supersets of $\Lambda + i$ by some fixed amount. So, naturally, applying the kernels at a positive fraction of offsets in $\mathbb{Z}^d$ should decrease the "free energy per site" by a positive amount. The formal proof that follows is not very different from well-known proofs of standard (non-gradient) analogs of this lemma.

Now, as before, take $\Lambda_n = [0, n-1]^d \subset \mathbb{Z}^d$. Write $\tilde\Lambda_n = \Lambda_n \cap (\mathcal{L}' + \Lambda)$, and choose $n$ large enough so that each connected component of $\tilde\Lambda_n$ is completely contained in $\Lambda_n$ and has all of its vertices at least $r$ units from the boundary of $\Lambda_n$. (If necessary, by Lemma 2.3.5 we may replace $\mu$ with $\theta_v \mu$ and $\Lambda$ with $\Lambda + v$ for some $v \in \mathbb{Z}^d$ in order to make this possible.) Fix a reference vertex $v_0 \in \Lambda_n \setminus \tilde\Lambda_n$. We can decompose $E^{|\Lambda_n| - 1}$ into the product $E^{|\Lambda_n| - |\tilde\Lambda_n| - 1} \otimes E^{|\tilde\Lambda_n|}$ by taking pairs in the product space to have the form $(x, y)$, where the components of $x$ are the values $\phi(v) - \phi(v_0)$ for $v \in \Lambda_n \setminus (\tilde\Lambda_n \cup \{v_0\})$ and the components of $y$ are $\phi(v) - \phi(v_0)$ for $v \in \tilde\Lambda_n$. For convenience in this proof only, we write $\mu_0 = e^{-H^o_{\Lambda_n}} \lambda^{|\Lambda_n| - 1}$. By Lemma 2.1.3 there exist regular conditional probability distributions $\mu_0^x$, $\bar\mu^x$, and $\mu^x$ on $\mathcal{F}^\tau_{\Lambda_n}$ describing the distribution on $y$ given the value $x$, when $(x, y)$ has distribution $\mu_0$, $\bar\mu$, and $\mu$, respectively.
Now, we claim the following: 

FE An (fi) - FE An (jl) = XQM^to) - X(pA n ,V<a) = V (^(^, mS) - ■ 

The first equality is true by definition. The second follows from Lemma 2.1.3 and the fact that $\mu$ and $\bar\mu$ have the same restriction to $\Lambda_n \setminus \tilde\Lambda_n$, where $\tilde\Lambda_n = \Lambda_n \cap (\mathcal{L}' + \Lambda)$. By our choice of $\mathcal{L}'$, we know that both $\mu^x$ and $\bar\mu^x$ are ($\mu$ almost surely) products of their restrictions to the components $\Lambda^i$ of $\tilde\Lambda_n$. Thus, by Lemma 2.1.4 we have that

Note that 71^ = /igU* ■ Hence, !Kg^ = 0. Moreover, the x marginals of /i and /Z 

coincide, hence 

A n A n A n 

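where the last equality uses the $\mathcal{L}$-invariance of $\mu$. For large $n$, the sum is $D$ times the number of components $\Lambda^i$ contained in $\Lambda_n$. It follows that

$$SFE^{\mathcal{L}'}(\mu) - SFE^{\mathcal{L}'}(\bar\mu) = \lim_{n \to \infty} \frac{1}{|\Lambda_n|} \left( FE_{\Lambda_n}(\mu) - FE_{\Lambda_n}(\bar\mu) \right) \ge D/I,$$

where $I$ is the index of $\mathcal{L}'$ in $\mathbb{Z}^d$.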

While $\bar\mu$ is not necessarily $\mathcal{L}$-invariant, it is $\mathcal{L}'$-invariant. We can make it $\mathcal{L}$-invariant by replacing it with an average $\bar\mu'$ over shifts by elements of $\mathcal{L}$ modulo $\mathcal{L}'$. By Lemma 2.1.1 and the definition of the specific free energy, this averaging cannot increase the specific free energy. By Lemma 2.3.8, $\bar\mu$ has finite slope; since $\mu$ and $\bar\mu$ have the same laws away from the sets $\Lambda + i$ with $i \in \mathcal{L}'$, it follows from the definition of slope on $\mathcal{L}'$-invariant measures that $S(\bar\mu) = S(\mu)$. Since $\bar\mu'$ is an $\mathcal{L}$-invariant measure that is an average of finitely many measures of this slope, it is also clear that $S(\bar\mu') = S(\mu)$. ∎
Chapter 3

Ergodic/extremal decompositions and SFE

In this chapter, we will cite several standard results about ergodic and extremal decompositions that we can apply to measures in $\mathcal{P}(\Omega, \mathcal{F}^\tau)$; in particular, we will see that every $\mathcal{L}$-invariant gradient measure $\mu$ can be written, in a unique way, as a weighted average of $\mathcal{L}$-ergodic gradient measures. Moreover, we can compute $SFE(\mu)$ as the weighted average of the specific free energy of the $\mathcal{L}$-ergodic components. The latter result is well known (see Chapter 15 of [43]) for ordinary (i.e., non-gradient) Gibbs measures on $\Omega = E^{\mathbb{Z}^d}$ when $E$ has a finite underlying measure. However, we must check that this result remains true for gradient Gibbs measures and the specific free energy we have constructed in this context. Throughout this chapter, we assume that a perturbed simply attractive potential $\Phi$ is fixed; gradient Gibbs measures and specific free energy are defined with respect to this $\Phi$.

3.1 Funaki-Spohn gradient measures 

In this section, we describe an alternate (but equivalent) formulation of gradient measures (which is also described in detail in a work of Funaki and Spohn). The difference between the formulation of Funaki and Spohn and our formulation is largely cosmetic. For the purposes of this text, ours is more convenient; however, the results about $\mathcal{L}$-ergodic and extremal decompositions described in [43] apply more directly to the formulation of Funaki and Spohn than to ours.

The main issue is that several of the basic facts that we will need about extremal and ergodic decompositions of gradient Gibbs measures (namely, Lemmas 3.2.1, 3.2.2, 3.2.3, 3.2.4, and 3.2.5) have only been stated and proved in the literature (for example, in the reference text [43]) for ordinary Gibbs measures. Although these results are not terribly difficult, reproving them individually in the gradient Gibbs measure context would consume a good deal of space and provide little new insight.

Instead, we will make a straightforward observation (following Funaki and Spohn) that the laws of the gradients (defined below) of functions sampled from gradient Gibbs measures are Gibbs measures, not with respect to an ordinary Gibbs potential, but with respect to a so-called specification, described below. Focusing on these gradients will enable us to cite the above mentioned lemmas directly from [43] instead of proving them ourselves.
For any function $\phi \in \Omega$, we define the discrete gradient $\nabla\phi : \mathbb{Z}^d \to E^d$ by

$$\nabla\phi(x) = (\phi(x + e_1) - \phi(x), \ldots, \phi(x + e_d) - \phi(x)),$$

where the $e_i$ are the standard basis vectors of $\mathbb{Z}^d$. Write $\bar{E} = E^d$. Denote by $\bar\Omega$ the set of functions from $\mathbb{Z}^d$ to $\bar{E}$ and by $\bar{\mathcal{F}}$ the Borel $\sigma$-algebra induced on $\bar\Omega$ by the product topology. Since $\nabla\phi$ only depends on the value of $\phi$ up to an additive constant, each measure $\mu$ on $(\Omega, \mathcal{F}^\tau)$ induces a measure $\bar\mu$ on $(\bar\Omega, \bar{\mathcal{F}})$.

A function $\psi \in \bar\Omega$ is called a gradient function if it can be written $\nabla\phi$ for some $\phi \in \Omega$. Funaki and Spohn characterize the gradient functions as those functions satisfying the "plaquette condition"

$$\psi(x)_i + \psi(x + e_i)_j = \psi(x)_j + \psi(x + e_j)_i$$

for all $1 \le i, j \le d$ and $x \in \mathbb{Z}^d$. (Here $\psi(x)_i$ denotes the $i$th component of $\psi(x)$.) Both sides equal $\phi(x + e_i + e_j) - \phi(x)$ when $\psi = \nabla\phi$. Denote by $\bar\Omega_G$ the set of gradient functions.
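
The relationship between height functions, discrete gradients, and the plaquette condition can be checked on a small example. The Python sketch below is a toy under stated assumptions: it works on a finite rectangular window of $\mathbb{Z}^2$ (so $d = 2$ and $E = \mathbb{R}$), stores a height function as a nested list `phi[x][y]`, and uses the hypothetical helper names `discrete_gradient` and `satisfies_plaquette`, none of which come from the text. The plaquette condition is checked in the form $\psi(x)_1 + \psi(x + e_1)_2 = \psi(x)_2 + \psi(x + e_2)_1$.

```python
def discrete_gradient(phi):
    """Return (gx, gy) for a height function on a finite window of Z^2.

    gx[x][y] = phi[x+1][y] - phi[x][y]   (first gradient component)
    gy[x][y] = phi[x][y+1] - phi[x][y]   (second gradient component)
    """
    n, m = len(phi), len(phi[0])
    gx = [[phi[x + 1][y] - phi[x][y] for y in range(m)] for x in range(n - 1)]
    gy = [[phi[x][y + 1] - phi[x][y] for y in range(m - 1)] for x in range(n)]
    return gx, gy


def satisfies_plaquette(gx, gy, tol=1e-12):
    """Check gx(x) + gy(x + e_1) == gy(x) + gx(x + e_2) on every unit square."""
    for x in range(len(gx)):
        for y in range(len(gy[0])):
            lhs = gx[x][y] + gy[x + 1][y]
            rhs = gy[x][y] + gx[x][y + 1]
            if abs(lhs - rhs) > tol:
                return False
    return True


# A height function and a copy shifted by a global additive constant.
phi = [[x * x + 3 * y for y in range(4)] for x in range(4)]
phi_shifted = [[value + 7 for value in row] for row in phi]

gx, gy = discrete_gradient(phi)
print(satisfies_plaquette(gx, gy))
print(discrete_gradient(phi_shifted) == (gx, gy))

# Changing a single gradient entry produces a field that is not a gradient.
gx_bad = [row[:] for row in gx]
gx_bad[0][0] += 1
print(satisfies_plaquette(gx_bad, gy))
```

The second check reflects the remark that $\nabla\phi$ depends on $\phi$ only up to an additive constant, and the third shows that a generic element of $\bar\Omega$ fails the plaquette condition.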

Instead of taking (as we do) the configuration space to be $\Omega$ and using a $\sigma$-algebra $\mathcal{F}^\tau$ that only measures properties of functions that are invariant under the addition of a global constant, Funaki and Spohn use $\bar\Omega$ as their configuration space and stipulate further that all the measures they consider are supported on $\bar\Omega_G$ (i.e., satisfy the plaquette condition almost surely).

Define the topology of local convergence on $\mathcal{P}(\bar\Omega_G, \bar{\mathcal{F}})$ to be the smallest in which $\mu \mapsto \mu(f)$ is continuous for every bounded cylinder function $f : \bar\Omega_G \to \mathbb{R}$. This is analogous to our definition of the topology of local convergence on $\mathcal{P}(\Omega, \mathcal{F}^\tau)$. The reader may easily verify the following:

Lemma 3.1.1 The map $\mu \mapsto \bar\mu$ described above gives a one-to-one correspondence between $\mathcal{P}(\Omega, \mathcal{F}^\tau)$ and $\mathcal{P}(\bar\Omega_G, \bar{\mathcal{F}}) \subset \mathcal{P}(\bar\Omega, \bar{\mathcal{F}})$. Moreover, the topology of local convergence on $\mathcal{P}(\Omega, \mathcal{F}^\tau)$ (as defined in Section 2.4) is equivalent to the topology of local convergence on $\mathcal{P}(\bar\Omega, \bar{\mathcal{F}})$ restricted to $\mathcal{P}(\bar\Omega_G, \bar{\mathcal{F}})$.

We extend the definition of specific free energy to $\mathcal{P}(\bar\Omega_G, \bar{\mathcal{F}})$ by writing $SFE(\bar\mu) = SFE(\mu)$ whenever $\mu \in \mathcal{P}(\Omega, \mathcal{F}^\tau)$. Citing Theorem 2.4.2 and Lemma 2.4.1 we have:

Corollary 3.1.2 $SFE$ is a lower semi-continuous function on $\mathcal{P}(\bar\Omega_G, \bar{\mathcal{F}})$ with respect to the topology of local convergence; moreover, the level sets $\bar{M}_C = \{\bar\mu \mid SFE(\bar\mu) \le C\}$ are compact and metrizable.

If $A \in \mathcal{F}^\tau$, denote by $\bar{A}$ the corresponding subset of $\bar\Omega_G$. Then we can extend the kernels $\gamma^\Phi_\Lambda(A, \phi)$ (defined in the first chapter) to this context by writing $\bar\gamma^\Phi_\Lambda(\bar{A}, \nabla\phi) = \gamma^\Phi_\Lambda(A, \phi)$. (Since the latter term is unchanged when a constant is added to $\phi$, the kernels are well defined.)
We would like to argue that, in some sense, $\bar\mu$ is a Gibbs measure if and only if it is preserved by these kernels (or, equivalently, if the corresponding measure $\mu$ is a gradient Gibbs measure). However, the following fact suggests that this is impossible with the definition of Gibbs measure we presented in the introduction:

Proposition 3.1.3 When $E = \mathbb{R}^m$ and $\Phi$ is a Gibbs potential (which admits at least one Gibbs measure) there exists no Gibbs potential $\bar\Phi$ such that $\bar\mu \in \mathcal{G}^{\bar\Phi}(\bar\Omega, \bar{\mathcal{F}})$ if and only if $\mu \in \mathcal{G}^{\Phi}(\Omega, \mathcal{F}^\tau)$.

Proof If $\bar\mu \in \mathcal{G}^{\bar\Phi}(\bar\Omega, \bar{\mathcal{F}})$, and $\psi$ is sampled from $\bar\mu$, then by definition, the law of $\psi(x)$, given the values of $\psi$ at the neighbors of $x$, is absolutely continuous with respect to the underlying measure on $\bar{E}$, with Radon-Nikodym derivative determined by the Hamiltonian $H^{\bar\Phi}_{\{x\}}(\psi)$. On the other hand, if $\bar\mu$ is supported on gradient functions, then $\psi(x)$ is completely determined (by the plaquette condition) from the value of $\psi$ at the neighbors of $x$; thus the conditional distribution of $\psi(x)$ is supported on a single point. The only way the above conditional measure can be absolutely continuous with respect to an underlying measure $\bar\lambda$ on $\bar{E}$ is if $\bar\lambda$ has a point mass at that point. This cannot happen if $\bar\lambda$ is Lebesgue measure. Moreover, switching to another underlying measure $\bar\lambda$ does not solve the problem. Since the law of $\nabla\phi(x)$, when $\phi$ is sampled from a gradient Gibbs measure, is absolutely continuous with respect to Lebesgue measure, the measure $\bar\lambda$ would have to have point masses on a set of positive Lebesgue measure (and in particular could not be $\sigma$-finite). But any finite measure of the form $e^{-H_\Lambda} \prod_{x \in \Lambda} d\bar\lambda(\psi(x))$, in which the individual random variables $\psi(x)$ are supported on point masses of $\bar\lambda$, is supported on point masses of the product $\prod_{x \in \Lambda} \bar\lambda$ and hence is supported on a countable set. This is not the case for general $\Lambda$ when $\psi$ is the gradient of a function sampled from a gradient Gibbs measure. ∎

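We will now expand our definition of Gibbs measure. First, if $(X, \mathcal{X})$ is any probability space and $\mathcal{B}$ a sub-$\sigma$-algebra of $\mathcal{X}$, then a probability kernel $\pi$ from $\mathcal{B}$ to $\mathcal{X}$ is proper if $\pi(B \mid \cdot) = 1_B$ for each $B \in \mathcal{B}$. The Gibbs re-randomization kernels $\gamma_\Lambda$ from $\mathcal{T}_\Lambda$ to $\mathcal{F}$ or from $\mathcal{T}^\tau_\Lambda$ to $\mathcal{F}^\tau$, as defined in the introduction, are examples of proper kernels.

Most of the theorems in [43] are proved for a more general class of families of proper Gibbs re-randomization kernels called specifications (in the sense of sections 1.1 and 1.2 of [43]; see also the work of Funaki and Spohn). The following is Definition 1.23 of [43]:

Definition: A specification on $\mathbb{Z}^d$ with state space $(E, \mathcal{E})$ is a family $\delta = \{\delta_\Lambda\}_{\Lambda \subset\subset \mathbb{Z}^d}$ of proper probability kernels $\delta_\Lambda$ from $\mathcal{T}_\Lambda$ to $\mathcal{F}$ which satisfy the consistency condition $\delta_\Delta \delta_\Lambda = \delta_\Delta$ when $\Lambda \subset \Delta$. The random fields in the set

$$\mathcal{G}(\delta) = \{\mu \in \mathcal{P}(\Omega, \mathcal{F}) : \mu(A \mid \mathcal{T}_\Lambda) = \delta_\Lambda(A \mid \cdot)\ \mu\text{-a.s.} \qquad (3.1)$$
$$\text{for all } A \in \mathcal{F} \text{ and } \Lambda \subset\subset \mathbb{Z}^d\} \qquad (3.2)$$

are called Gibbs measures with respect to the specification $\delta$.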
Let A' be the set A fl + Y2t=i e « J ■ Note that if the V0 is known at all x ^ A, and the
value of <p is known at some x E" A, then we can deduce the value of <fi at x E" A'; however, 
this information tells us nothing about the value of at vertices in A'. 

Now, define $\delta_\Lambda$ to be the kernel from $(\bar\Omega, \bar{\mathcal{T}}_\Lambda)$ to $(\bar\Omega, \bar{\mathcal{F}})$ corresponding to the kernel $\gamma^\Phi_\Lambda$ on $(\Omega, \mathcal{F}^\tau)$. The reader may verify that these kernels form a specification on $\mathbb{Z}^d$ with state space $(\bar{E}, \bar{\mathcal{E}}) := (E^d, \mathcal{E}^d)$.

The following statement follows immediately from the definitions:

Lemma 3.1.4 $\mu$ is a Gibbs measure (ergodic measure, extremal Gibbs measure) on $(\Omega, \mathcal{F}^\tau)$ (with respect to the Gibbs specification $\gamma^\Phi$) if and only if $\bar\mu$ is a Gibbs measure (resp., ergodic measure, extremal Gibbs measure) on $(\bar\Omega, \bar{\mathcal{F}})$ with respect to the specification $\delta$.

3.2 Extremal and ergodic decompositions 

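In this section, we cite several results from Chapter 7 and Chapter 14 of [43] (e.g., existence of extremal and ergodic decompositions), all of which apply to both ordinary measures and gradient measures. Throughout this section, we assume that a gradient potential $\Phi$ is given. In each case, the gradient analog of the statement follows from the cited non-gradient result by the correspondence described in the previous section.

As in the first chapter, we say a measurable subset $A$ of $\Omega$ is a tail event if $A \in \mathcal{T} = \bigcap_{\Lambda \subset\subset \mathbb{Z}^d} \mathcal{F}_{\mathbb{Z}^d \setminus \Lambda}$. We say $A$ is an $\mathcal{L}$-invariant event if it is preserved by translations by members of $\mathcal{L}$; both $\mathcal{T}$ and the set $\mathcal{I}_{\mathcal{L}}$ of $\mathcal{L}$-invariant events are $\sigma$-algebras. (See Proposition 7.3, Corollary 7.4, and Remark 14.3 of [43].) We denote the sets of extremal and ergodic Gibbs measures by $\mathrm{ex}\,\mathcal{G}(\Omega, \mathcal{F})$ and $\mathrm{ex}\,\mathcal{G}_{\mathcal{L}}(\Omega, \mathcal{F})$, respectively; we sometimes abbreviate $\mathcal{G}(\Omega, \mathcal{F})$ and $\mathcal{G}_{\mathcal{L}}(\Omega, \mathcal{F})$ by $\mathcal{G}$ and $\mathcal{G}_{\mathcal{L}}$, respectively. Similarly, the sets of extremal and ergodic gradient Gibbs measures are written, respectively, $\mathrm{ex}\,\mathcal{G}(\Omega, \mathcal{F}^\tau)$ and $\mathrm{ex}\,\mathcal{G}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$; we abbreviate $\mathcal{G}(\Omega, \mathcal{F}^\tau)$ and $\mathcal{G}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$ by $\mathcal{G}^\tau$ and $\mathcal{G}^\tau_{\mathcal{L}}$.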

We say $\mu$ is trivial on a $\sigma$-algebra $\mathcal{A}$ if $\mu(A) \in \{0, 1\}$ for all $A \in \mathcal{A}$. Now, we cite the following characterization of extremal and $\mathcal{L}$-ergodic measures in terms of their behavior on tail and $\mathcal{L}$-invariant events, respectively (Theorems 7.7 and 14.5 of [43]):

Lemma 3.2.1 The following hold for all Gibbs measures on $(\Omega, \mathcal{F})$:

1. A probability measure $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F})$ is extreme in $\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F})$ if and only if $\mu$ is trivial on the $\sigma$-algebra $\mathcal{I}_{\mathcal{L}}$ of $\mathcal{L}$-invariant events.

2. A Gibbs measure $\mu \in \mathcal{G}$ is extreme in $\mathcal{G}$ if and only if $\mu$ is trivial on $\mathcal{T}$. Distinct extremal Gibbs measures $\mu_1$ and $\mu_2$ are mutually singular in that there exists an $A \in \mathcal{T}$ with $\mu_1(A) = 0$ and $\mu_2(A) = 1$.

3. A Gibbs measure $\mu \in \mathcal{G}_{\mathcal{L}}$ is $\mathcal{L}$-ergodic if and only if $\mu$ is trivial on $\mathcal{I}_{\mathcal{L}}$. Distinct $\mathcal{L}$-ergodic measures $\mu_1$ and $\mu_2$ are mutually singular in that there exists an $A \in \mathcal{I}_{\mathcal{L}}$ with $\mu_1(A) = 0$ and $\mu_2(A) = 1$.

Analogous statements are true for gradient measures $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$.

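There may exist extremal Gibbs measures that are not $\mathcal{L}$-ergodic and $\mathcal{L}$-ergodic Gibbs measures that are not extremal. However, as the following lemma shows, every extremal $\mathcal{L}$-invariant measure is necessarily $\mathcal{L}$-ergodic. (Theorem 14.15 of [43].)

Lemma 3.2.2 A Gibbs measure $\mu \in \mathcal{G}_{\mathcal{L}}$ is an extreme point of the convex set $\mathcal{G}_{\mathcal{L}}$ if and only if $\mu$ is $\mathcal{L}$-ergodic. That is,

$$\mathrm{ex}\,\mathcal{G}_{\mathcal{L}} = \mathcal{G}_{\mathcal{L}} \cap \mathrm{ex}\,\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}).$$

Furthermore, $\mathcal{G}_{\mathcal{L}}$ is a face of $\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F})$. That is, if $\mu, \nu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F})$ and $0 < s < 1$ are such that $s\mu + (1 - s)\nu \in \mathcal{G}_{\mathcal{L}}$, then $\mu, \nu \in \mathcal{G}_{\mathcal{L}}$. Analogous statements are true for gradient measures.

Given a single observation $\phi$ from an extremal Gibbs measure or an $\mathcal{L}$-ergodic measure $\mu \in \mathcal{P}(\Omega, \mathcal{F})$, it is $\mu$-almost-surely possible to reconstruct $\mu$ from $\phi$ in a way that we will now describe. Let $\{\Lambda_n\}$ be any increasing sequence of cubes in $\mathbb{Z}^d$ such that $|\Lambda_n| \to \infty$. Whenever $\gamma_{\Lambda_n}(\cdot \mid \phi)$ has a limit in the topology of local convergence as $n$ tends to $\infty$, we denote this limit by $\pi^\phi$. We denote by $\pi^\phi_{\mathcal{L}}$ the "shift-averaged" measure given by

$$\pi^\phi_{\mathcal{L}}(A) = \lim_{n \to \infty} |\Lambda_n \cap \mathcal{L}|^{-1} \sum_{x \in \Lambda_n \cap \mathcal{L}} 1_A(\theta_x \phi)$$

when this limit exists. We can extend the functions $\pi$ and $\pi_{\mathcal{L}}$ to functions from $\Omega$ to $\mathcal{P}(\Omega, \mathcal{F})$ and $\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F})$, respectively, by setting them equal to some arbitrary $\nu$ in (respectively) $\mathcal{P}(\Omega, \mathcal{F})$ and $\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F})$ when these limits fail to exist. The following lemma makes precise our ability to recover $\mu$ from a single observation. (The first half is [43], Theorem 7.12. The second follows from [43], Theorem 14.10.)

Lemma 3.2.3 The following are true:

1. If $\mu \in \mathrm{ex}\,\mathcal{G}$, then $\mu(\{\phi \in \Omega : \pi^\phi = \mu\}) = 1$.

2. If $\mu \in \mathrm{ex}\,\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F})$, then $\mu(\{\phi \in \Omega : \pi^\phi_{\mathcal{L}} = \mu\}) = 1$.

Analogous statements are true of gradient measures.

Next, we would like to say that each measure in $\mathcal{G}$ (respectively, $\mathcal{G}_{\mathcal{L}}$) is a weighted average of extremal (respectively, ergodic) measures. In order to precisely define a "weighted average" of elements in $\mathrm{ex}\,\mathcal{G}$ and $\mathrm{ex}\,\mathcal{G}_{\mathcal{L}}$, we need to define $\sigma$-algebras on these sets of measures. To do this, for each $A \in \mathcal{F}$, consider the evaluation map $e_A : \mu \mapsto \mu(A)$. Denote by $e(\mathrm{ex}\,\mathcal{G})$ the smallest $\sigma$-algebra on $\mathrm{ex}\,\mathcal{G}$ with respect to which each $e_A$ is measurable. Define $e(\mathrm{ex}\,\mathcal{G}_{\mathcal{L}})$ similarly. The following decomposition theorem shows that $\mathcal{G}$ and $\mathcal{G}_{\mathcal{L}}$ are isomorphic to the simplices of probability measures on $\mathrm{ex}\,\mathcal{G}$ and $\mathrm{ex}\,\mathcal{G}_{\mathcal{L}}$, respectively. (See [43], Theorem 7.26 and Theorem 14.17.)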
Lemma 3.2.4 For each $\mu \in \mathcal{G}$ there exists a unique weight $w_\mu \in \mathcal{P}(\mathrm{ex}\,\mathcal{G}, e(\mathrm{ex}\,\mathcal{G}))$ such that for each $A \in \mathcal{F}$,

$$\mu(A) = \int_{\mathrm{ex}\,\mathcal{G}} \nu(A)\, w_\mu(d\nu).$$

The mapping $\mu \mapsto w_\mu$ is a bijection between $\mathcal{G}$ and $\mathcal{P}(\mathrm{ex}\,\mathcal{G}, e(\mathrm{ex}\,\mathcal{G}))$. Furthermore, $w_\mu$ is the law of the image of $\mu$ under the mapping $\phi \mapsto \pi^\phi$. These results remain true when $\mathcal{G}$ is replaced by $\mathcal{G}_{\mathcal{L}}$ and $\pi^\phi$ is replaced by $\pi^\phi_{\mathcal{L}}$. Analogous decompositions exist for gradient measures.

Lemma 3.2.5 For each $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F})$ there exists a unique weight

$$w_\mu \in \mathcal{P}(\mathrm{ex}\,\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}), e(\mathrm{ex}\,\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F})))$$

such that for each $A \in \mathcal{F}$,

$$\mu(A) = \int_{\mathrm{ex}\,\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F})} \nu(A)\, w_\mu(d\nu).$$

The mapping $\mu \mapsto w_\mu$ is a bijection between $\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F})$ and $\mathcal{P}(\mathrm{ex}\,\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}), e(\mathrm{ex}\,\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F})))$. Furthermore, $w_\mu$ gives the law for the image of $\mu$ under the mapping $\phi \mapsto \pi^\phi_{\mathcal{L}}$. Analogous decompositions exist for gradient measures.

In less formal terms, the lemmas state that sampling $\phi$ from $\mu \in \mathcal{G}$ is equivalent to:

1. First choosing an extremal measure $\mu_0$ from an extremal decomposition measure.

2. Then choosing $\phi$ from $\mu_0$.

Similarly, sampling $\phi$ from $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F})$ is equivalent to:

1. First choosing an $\mathcal{L}$-ergodic measure $\mu_0$ from an ergodic decomposition measure.

2. Then choosing $\phi$ from $\mu_0$.

Note that an $\mathcal{L}$-ergodic Gibbs measure $\mu \in \mathcal{G}_{\mathcal{L}}$ may or may not be extremal. In later chapters, we will be interested not only in classifying $\mathcal{L}$-ergodic gradient Gibbs measures but also in determining how each $\mathcal{L}$-ergodic gradient Gibbs measure decomposes into extremal components.

The fact that $SFE$ is strongly affine (i.e., that $SFE(\mu) = w_\mu(SFE)$) is essential to our proof of the large deviations principle and the variational principle. A longer, more detailed proof of this result, which follows Chapter 15 of [43], is given in the appendix. An advantage of the longer proof is that it also yields results of independent interest, including one interpretation of $SFE(\mu)$ in terms of the "conditional free energy" of one fundamental domain of $\mathcal{L}$, conditioned on its "lexicographic past," and another interpretation involving discrete derivatives. We do not use these interpretations anywhere else (hence, their relegation to the appendix). The proof described here follows [26].

Lemma 3.3.1 The function $\rho \mapsto SFE(\rho)$ is affine.

Proof We follow the proof given in Exercise 4.4.41 of [26]. Suppose $\rho = a\mu + (1 - a)\nu$ with $0 < a < 1$. Recall the definition

$$SFE(\rho) = \lim_{n \to \infty} |\Lambda_n|^{-1} FE_{\Lambda_n}(\rho) = \lim_{n \to \infty} |\Lambda_n|^{-1} \mathcal{H}(\rho_{\Lambda_n}, e^{-H^o_{\Lambda_n}} \lambda^{|\Lambda_n| - 1}).$$

For this proof only, denote $\pi_n = e^{-H^o_{\Lambda_n}} \lambda^{|\Lambda_n| - 1}$. Taking the limit, the convexity of $SFE$ follows immediately from the convexity of $\mathcal{H}(\cdot, \pi_n)$. To prove concavity, it is enough to show that

$$\frac{1}{|\Lambda_n|} \mathcal{H}(\rho_{\Lambda_n}, \pi_n) \ge \frac{1}{|\Lambda_n|} \left( a\,\mathcal{H}(\mu_{\Lambda_n}, \pi_n) + (1 - a)\,\mathcal{H}(\nu_{\Lambda_n}, \pi_n) \right) + o(1).$$

If either of $\mu_{\Lambda_n}$ or $\nu_{\Lambda_n}$ fails to be absolutely continuous with respect to $\pi_n$, then both sides of the inequality are equal to infinity. Otherwise, let $f_n$ and $g_n$ be the Radon-Nikodym derivatives of $\mu_{\Lambda_n}$ and $\nu_{\Lambda_n}$, respectively, with respect to $\pi_n$. Then we can rewrite the inequality:

$$\pi_n\big((af_n + (1 - a)g_n) \log(af_n + (1 - a)g_n)\big) \ge \pi_n\big(af_n \log f_n + (1 - a)g_n \log g_n\big) + o(|\Lambda_n|).$$

In fact, a simple inequality (see Exercise 4.4.41 of [26]) states that if $f_n$ and $g_n$ assume any values in $[0, \infty)$ and $0 < a < 1$, then

$$(af_n + (1 - a)g_n) \log(af_n + (1 - a)g_n) \ge af_n \log f_n + (1 - a)g_n \log g_n - |f_n - g_n|.$$

Since $\pi_n(f_n) = \pi_n(g_n) = 1$, the integral of the "error term" (the last term on the right in the above expression) is at most 2 in absolute value; in particular, it is $o(|\Lambda_n|)$. ∎
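
The pointwise bound behind this error term can be sanity-checked numerically. The sketch below assumes that the error term is $|f - g|$ (which is consistent with the stated bound of 2 on its integral) and tests, on random inputs, that the convexity gap $a f \log f + (1 - a) g \log g - h \log h$ with $h = af + (1 - a)g$ lies between $0$ and $|f - g|$. The function names `xlogx`, `convexity_gap`, and `check_bound` are hypothetical, and a random test is evidence rather than a proof.

```python
import math
import random


def xlogx(t):
    """t * log(t), extended by continuity to 0 at t = 0."""
    return 0.0 if t == 0.0 else t * math.log(t)


def convexity_gap(f, g, a):
    """a*f*log f + (1-a)*g*log g - h*log h, where h = a*f + (1-a)*g."""
    h = a * f + (1.0 - a) * g
    return a * xlogx(f) + (1.0 - a) * xlogx(g) - xlogx(h)


def check_bound(trials=10000, seed=0, tol=1e-9):
    """Return True if 0 <= gap <= |f - g| held (up to tol) in every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        f = rng.uniform(0.0, 50.0)
        g = rng.uniform(0.0, 50.0)
        a = rng.uniform(0.01, 0.99)
        gap = convexity_gap(f, g, a)
        if gap < -tol or gap > abs(f - g) + tol:
            return False
    return True


print(check_bound())
```

The lower bound is the convexity of $t \mapsto t \log t$, which gives convexity of relative entropy; the upper bound is what controls the failure of concavity by a quantity whose integral against $\pi_n$ does not grow with $|\Lambda_n|$.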

In fact, $SFE$ is also "strongly affine" in the following sense:

Theorem 3.3.2 If $\mu$ can be written

$$\mu = \int_{\mathrm{ex}\,\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)} \nu\, w_\mu(d\nu),$$

then

$$SFE(\mu) = \int_{\mathrm{ex}\,\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)} SFE(\nu)\, w_\mu(d\nu) = w_\mu(SFE).$$

The above theorem clearly follows from Lemma 3.3.1, Corollary 3.1.2, and the following lemma (applied to the gradient field $\bar\mu$).

Lemma 3.3.3 If $F : \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}) \to [0, \infty]$ is lower-semicontinuous (with respect to a topology in which the level sets $\{\mu : F(\mu) \le C\}$ are metrizable) and affine, then it is strongly affine. That is,

$$F\left( \int \nu\, w_\mu(d\nu) \right) = \int F(\nu)\, w_\mu(d\nu).$$

Proof The proof is identical to the proof of Lemma 5.2.24 of [26]. ∎


Corollary 3.3.4 The function $a : \Omega \to \mathbb{R}$, defined by $a(\phi) = SFE(\pi^\phi_{\mathcal{L}})$, is bounded below and satisfies

$$SFE(\mu) = \mu(a) = \mu(SFE(\pi_{\mathcal{L}}))$$

for all $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$.

Proof By Lemma 3.2.3, the function $\mu \mapsto \mu(a)$ agrees with the function $\mu \mapsto SFE(\mu)$ when $\mu$ is $\mathcal{L}$-ergodic. Since $a$ is measurable (it is obtained from a limit of the $e(\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau))$-measurable functions $\mu \mapsto |\Lambda_n|^{-1} FE_{\Lambda_n}(\mu)$), it follows from Lemma 3.2.4 that the statement proved in Theorem 3.3.2 for $\mu \mapsto SFE(\mu)$ applies to the function $\mu \mapsto \mu(a)$ as well. ∎
Chapter 4

Surface tension and energy

Define the surface tension $\sigma : \mathbb{R}^{d \times m} \to \mathbb{R}$ by writing

$$\sigma(u) = \inf_{\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau),\ S(\mu) = u} SFE(\mu).$$

We will give another equivalent definition of surface tension (as a normalized limit of log partition functions on tori) in a later chapter. In this chapter, we will make several elementary observations about the function $\sigma$ and the set of slopes on which $\sigma$ is finite. We also discuss some basic facts about the existence of finite energy functions satisfying boundary conditions. Further discussion of this and related problems can be found in many texts on linear programming and network flows; see, e.g., [2]. Unless otherwise noted, we will assume throughout this chapter, for notational simplicity, that $m = 1$. The extensions to higher dimensions $m > 1$ are in all cases straightforward.

Throughout this chapter, we assume that an $\mathcal{L}$-invariant perturbed simply attractive potential $\Phi$ is given. Denote by $U_\Phi$ the interior of the region on which $\sigma$ is finite. Since $SFE$ is affine, it is clear that $\sigma$ is convex; in particular, this implies that $U_\Phi$ is a convex open set and that $\sigma$ is continuous on $U_\Phi$ and (provided $U_\Phi$ is non-empty) equal to $\infty$ outside of the closure $\overline{U_\Phi}$ of $U_\Phi$. We refer to $U_\Phi$ as the space of allowable slopes. Some of the results in subsequent chapters (including the second half of the variational principle) will apply only to slopes in $U_\Phi$. In the next section, we make some preliminary constructions that will be necessary in the continuous cases $E = \mathbb{R}$ and $E = \mathbb{R}^m$, and which we will apply mainly to the case that $\Phi$ is a perturbed ISAP.

4.1 Energy bounds for ISAPs when E = R 

4.1.1 Bounding free energy and surface tension in terms of V 

We will assume throughout this section that $m = 1$ and $V : \mathbb{R} \to \mathbb{R}^+$ is a convex, symmetric (i.e., $V(x) = V(-x)$) difference potential, and that $\Phi_V$ is the corresponding ISAP. We begin with the following question: what does the shape of $V$ tell us about the shape of $\sigma$?
To begin to answer this, we extend the definition of $V$ to vectors by writing $V(u) = \sum_{i=1}^d V(u_i)$. This quantity gives the specific energy of the deterministic singleton measure $\mu \in \mathcal{P}(\Omega, \mathcal{F}^\tau)$ supported on the plane $\phi_u(x) = (x, u)$ (defined up to additive constant). Thus, if we replaced $SFE$ by $SE$ in the definition of $\sigma$, we would have exactly $\sigma(u) = V(u)$ (by Jensen's inequality).

However, $SFE(\mu)$ for the $\mu$ defined above is infinite, since the entropy of $\mu$ is $-\infty$. Nonetheless, if $\mu$ is instead taken to be the law of $\phi = \phi_u + \phi'$, where the law $\mu'$ of $\phi'$ has slope zero, specific entropy $\log \epsilon$, and $|\phi'(x) - \phi'(y)| \le \epsilon$ $\mu'$-a.s. (for some constant $\epsilon$; see the next subsection for the existence of such a measure $\mu'$), then $SFE(\mu) \le \sup\{V(u') : |u' - u|_\infty \le \epsilon\} - \log \epsilon$, where $|u' - u|_\infty = \sup_{i=1}^d |u_i - u'_i|$. Applying this to a particular choice of $\mu'$ yields the following:

Lemma 4.1.1 When $\Phi_V$ is an ISAP, $E = \mathbb{R}$, and $V$ grows at most exponentially fast, there exist constants $C_1 > 0$ and $C_2$ (depending only on $V$) such that $\sigma(u) \le C_1 V(u) + C_2$. If $\Phi_V$ is any ISAP, then $\sigma(u) \le \sup\{V(u') : |u' - u|_\infty \le \epsilon\} - \log \epsilon$ for all $\epsilon$.

To get a bound in the other direction, note that $SFE(\mu) = \sigma(u)$ for some $\mu$ with $S(\mu) = u$, and Lemma 2.3.8 then implies $\mu(V(\phi(y) - \phi(x))) \le C_1 \sigma(u) + C_2$ for some constants $C_1$ and $C_2$ when $x$ and $y$ are adjacent (with the constants depending on $V$, but not on $\mu$, $x$, or $y$), which in turn implies (by Jensen's inequality) the following:

Lemma 4.1.2 When $\Phi_V$ is an ISAP and $E = \mathbb{R}$, there exist constants $C_1 > 0$ and $C_2$ (depending only on $V$) such that $V(u) \le C_1 \sigma(u) + C_2$.

Combining the lemmas implies that $\sigma$ and $V$ generate identical Orlicz-Sobolev norms (discussed in detail in a later chapter) when $V$ grows at most exponentially fast. When $V$ increases super-exponentially, however, this need not be the case. One of the most important cases in which the first part of Lemma 4.1.1 fails is the hard constraint model in which $V(\eta) = 0$ if $\eta \in [-1, 1]$ and $\infty$ otherwise. It is easy to see that $\sigma(u)$ tends to $\infty$ as $u$ tends to the boundary of $[-1, 1]^d$, even though $V(u)$ is constant for $u \in [-1, 1]^d$.

We will now define a function $\bar{V}$, called the wedge-normalization of $V$, such that $\bar{V}(u)$ and $\sigma(u)$ do agree up to a constant factor. In the hard constraint model above, it would be natural to guess that the second estimate in Lemma 4.1.1 is tight, so that on the interval $[-1, 1]$, $\bar{V}(\eta)$ should be approximately (some constant times) the log of the distance from $\eta$ to the nearest of the endpoints $-1$ and $1$. In fact, the particular expression for $\bar{V}(\eta)$ we use is $V(\eta)$ minus four times the log of the distance from $F(\eta)$ to either $0$ or $1$, where $F(\eta)$ is the fraction of the mass of the measure $e^{-V(\eta)} d\eta$ (where $d\eta$ is Lebesgue measure on $\mathbb{R}$) that lies to the left of $\eta$. (The particular form is convenient because it allows us to express free energy with respect to $\bar{V}$ in terms of relative entropy with respect to a certain "wedge measure," defined below.)

In addition to obtaining information about $\sigma$ (which will later be useful in determining
the topologies in which surface-shape large deviations principles apply), we will show that if
$\phi : \Lambda \to \mathbb{R}$ is such that the nearest-neighbor sum of $\bar V(\phi(x) - \phi(y))$ is equal to $C$, then
there is a measure describing a random small perturbation of $\phi$ whose free energy (with respect
to $V$) is at most a constant times $C$.

To motivate the construction of these random perturbations, we mention that they will
be useful in Chapter 7 when we engage in the mechanical process of proving lower bounds on
the Gibbs measure of the set of functions $\phi : D_n \to \mathbb{R}$ that approximate a particular function
$f$ on $D$; one technique will involve constructing a $\phi_0$ that approximates $f$, is piecewise linear
on large pieces of $D_n$, and has a $\bar V$ energy that we can control. Then we can get a lower
bound by estimating the Gibbs measure of the set of $\phi$ that are "perturbations" of $\phi_0$ in the
non-linear regions and allowed to vary more freely on the linear regions (which are dealt with
separately using the various subadditivity limits of Chapter EJ), see Figure 177111

The reader who is only interested in the case that $V$ increases at most exponentially fast
may skip the proofs of the lemmas in the remainder of this section (since in this case, one
may take $\bar V = V$ and the results are still obviously true). The reader who is only interested
in the case $E = \mathbb{Z}$ need not read this section at all.

4.1.2 Defining box measures and $\bar V$

Given $\Lambda$, a box measure $\mu$ on the space of functions on $\Lambda$ is a uniform measure on a set
$B = \{\phi \mid \phi_1 \le \phi \le \phi_2\}$ where $\phi_1$ and $\phi_2$ are also real functions on $\Lambda$. (We write $f \le g$
if $f(x) \le g(x)$ for each $x \in \Lambda$.) An upper bound on the free energy of $\mu$ is given by
$\sup_{\phi \in B} H^\Phi_\Lambda(\phi) - \sum_{y \in \Lambda} \log(\phi_2(y) - \phi_1(y))$. In the remainder of this text, when we need to
prove that a reasonably low (or at least non-infinite) free-energy Gibbs measure exists with
certain properties, we will sometimes construct such measures explicitly using box measures. In
this section we will construct a convex function $\bar V$, based on $V$, and use the shorthand
$\Phi = \Phi_V$ and $\bar\Phi = \Phi_{\bar V}$. Both $\Phi$ and $\bar\Phi$ will be ISAPs. We derive an upper bound on
$FE^{\bar\Phi}_\Lambda(\mu)$ in terms of $FE^\Phi_\Lambda(\mu)$ and show that, for constants $C_1$ and $C_2$, there always exists a
box measure centered at $\phi$ with $\Phi$ free energy at most $C_1 H^{\bar\Phi}(\phi) + C_2$. The first part of our
construction involves showing that the relative entropy of a measure $\nu$ on $[0, 1]$ with respect
to a certain "wedge-density" measure is not much more than twice the relative entropy of $\nu$
with respect to a uniform measure.
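The box-measure bound stated above can be evaluated directly on a small example. The sketch below is only an illustration under stated assumptions: it takes $\Lambda$ to be a short one-dimensional chain and $V(\eta) = \eta^2$, and the helper names `chain_energy` and `box_free_energy_bound` are hypothetical. Because the energy is a convex function of $\phi$, its supremum over the box is attained at a corner, so for small $\Lambda$ it suffices to enumerate the $2^{|\Lambda|}$ corners.

```python
import itertools
import math

def chain_energy(phi, V):
    """Nearest-neighbor energy of phi on the path graph 0-1-...-(n-1)."""
    return sum(V(phi[i + 1] - phi[i]) for i in range(len(phi) - 1))

def box_free_energy_bound(phi1, phi2, V):
    """sup over the box of the energy, minus the sum of the logs of the box widths."""
    # Each corner picks either the lower or the upper value at every site.
    corners = itertools.product(*zip(phi1, phi2))
    sup_energy = max(chain_energy(c, V) for c in corners)
    log_volume = sum(math.log(hi - lo) for lo, hi in zip(phi1, phi2))
    return sup_energy - log_volume

V = lambda eta: eta * eta  # assumed convex, finite potential
unit_box = box_free_energy_bound([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], V)
half_box = box_free_energy_bound([0.0, 0.0, 0.0], [0.5, 0.5, 0.5], V)
```

Shrinking the box lowers the worst-case energy but raises the entropy cost $-\sum \log(\text{width})$, which is the trade-off the later lemmas of this section control.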

Lemma 4.1.3 Let $\mu_1$ be the uniform distribution on $[0, 1]$ and $\mu_2$ the measure with density
given by the wedge-shaped function $g(\eta) = 2 - 4|\eta - \tfrac{1}{2}|$. If $\mathcal{H}(\nu, \mu_1)$ is finite, then
$\mathcal{H}(\nu, \mu_2)$ is finite also and furthermore, for any $C_1 > 2$, there exists a $C_2$ for which
$\mathcal{H}(\nu, \mu_2) \le C_1 \mathcal{H}(\nu, \mu_1) + C_2$ for any Lebesgue-measurable $\nu$ on $[0, 1]$.

Proof Given $\beta$, by Lemma 2.1.1 we can compute the density function $f$ on $[0, 1]$ that
minimizes $\int f(\eta) \log f(\eta)\, d\eta - \beta \int f(\eta) \log g(\eta)\, d\eta$: we use the fact that this expression is
equal to $\mathcal{H}(\mu, \nu) + \log C$ where $\mu$ has density $f$ and $\nu$ has density $C g^\beta$ with
$C^{-1} = \int g^\beta(\eta)\, d\eta$ (which makes sense provided $\beta > -1$). The minimum occurs when $\mu = \nu$,
i.e., when $f = g^\beta$ (up to constant multiple). In particular, this minimum is finite when
$-1 < \beta < 0$. We can rewrite this statement (using the fact that $(1 - \beta) + \beta = 1$) as follows:
whenever $-1 < \beta < 0$, there exists an $\alpha$ for which

$$(1 - \beta) \int f(\eta) \log f(\eta)\, d\eta + \beta \left[ \int f(\eta) \log f(\eta)\, d\eta - \int f(\eta) \log g(\eta)\, d\eta \right] \ge \alpha.$$

Dividing by $\beta$ (and recalling that $\beta < 0$) we have:

$$\int f(\eta) \log \frac{f(\eta)}{g(\eta)}\, d\eta \le \frac{\alpha}{\beta} + \frac{\beta - 1}{\beta} \int f(\eta) \log f(\eta)\, d\eta.$$

By taking $\beta$ close to $-1$, we can make $\frac{\beta - 1}{\beta}$ arbitrarily close to 2. □

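Now let $Z = \int_{-\infty}^{\infty} e^{-V(\eta)}\, d\eta$ and $F(\eta) = Z^{-1} \int_{-\infty}^{\eta} e^{-V(\zeta)}\, d\zeta$. If $X$ is the random variable
on $\mathbb{R}$ with distribution given by $F$, then $F(X)$ is uniform on $[0, 1]$. Now, we define
$\bar V(\eta) = V(\eta) - \log g(F(\eta))$. Writing $\bar Z = \int_{-\infty}^{\infty} e^{-\bar V(\eta)}\, d\eta$, it is also easy to check that
$\bar Z^{-1} e^{-\bar V}$ is the density function for the random variable $F^{-1}(G^{-1}(F(X)))$, where
$G(\eta) = \int_0^\eta g(\zeta)\, d\zeta$ is the distribution function for $g$.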
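For the hard constraint model the definition above can be evaluated in closed form, which makes the behavior of the wedge-normalization easy to inspect. The following sketch assumes $V = 0$ on $[-1, 1]$ and $\infty$ elsewhere, so that $F(\eta) = (\eta + 1)/2$ on that interval; the function names are hypothetical and chosen only for this illustration.

```python
import math

def g(t):
    """Wedge density on [0, 1]: g(t) = 2 - 4|t - 1/2|."""
    return 2.0 - 4.0 * abs(t - 0.5)

def F_hard(eta):
    """Distribution function of e^{-V} / Z when V = 0 on [-1, 1] and infinity elsewhere."""
    return min(1.0, max(0.0, (eta + 1.0) / 2.0))

def V_bar_hard(eta):
    """Wedge-normalization V(eta) - log g(F(eta)) of the hard constraint potential."""
    if abs(eta) >= 1.0:
        return math.inf  # g(F(eta)) vanishes at the endpoints and V is infinite beyond them
    return -math.log(g(F_hard(eta)))  # V(eta) = 0 on (-1, 1)

values = [V_bar_hard(x) for x in (0.0, 0.5, 0.9, 0.99)]
```

In this example $\bar V(\eta) = -\log(2(1 - |\eta|))$ on $(-1, 1)$: it is bounded below by $-\log 2$ because $g \le 2$, and it blows up logarithmically at the endpoints even though $V$ itself stays equal to zero there.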

Lemma 4.1.4 The following are true for the $\bar V$ defined above:

1. $\bar V(\eta) \ge V(\eta) - \log 2$.

2. If $(a, b)$ is the interval on which $V$ is finite (here $a, b \in \mathbb{R} \cup \{-\infty, \infty\}$), then

$$\lim_{\eta \to a^+} \bar V(\eta) = \lim_{\eta \to b^-} \bar V(\eta) = \infty.$$

3. $\bar V$ is convex.

Proof Recall that $\bar V(\eta) - V(\eta) = -\log g(F(\eta))$. The first item follows because $g \le 2$, the
second because $g(F(\eta))$ tends to zero as $\eta$ tends to $a$ or $b$.

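For the third item, since we are assuming that $V$ is convex, it is enough to prove that

$$A(\eta) = \bar V(\eta) - V(\eta) = -\log g(F(\eta)) = -\log g\left( \int_{-\infty}^{\eta} e^{-V(y)}\, dy \right)$$

is also convex (here and below we take $Z = 1$ for notational simplicity; a general $Z$ only
shifts $A$ by the constant $\log Z$). From the symmetries of $V$ and $g$, we can see that
$A(\eta) = A(-\eta)$; $A$ is decreasing for $\eta < 0$ and increasing for $\eta > 0$. It is therefore
sufficient to show that $A$ is convex on the interval $(0, \infty)$. On this interval, using the
definition of $g$ given above, we can write

$$A(\eta) = -\log 4(1 - F(\eta)) = -\log 4 - \log \int_{\eta}^{\infty} e^{-V(y)}\, dy.$$

Since this function is continuous, it is enough to check that for any $0 < \epsilon < \eta$, we have

$$2A(\eta) \le A(\eta + \epsilon) + A(\eta - \epsilon),$$

or equivalently (by Fubini's theorem) we must show

$$\int_{\eta}^{\infty} \int_{\eta}^{\infty} e^{-V(a) - V(b)}\, da\, db - \int_{\eta + \epsilon}^{\infty} \int_{\eta - \epsilon}^{\infty} e^{-V(a) - V(b)}\, da\, db \ge 0.$$

Canceling the common region of integration, we can rewrite the left hand side as

$$\int_{\eta}^{\infty} \int_{\eta}^{\eta + \epsilon} e^{-V(a) - V(b)}\, da\, db - \int_{\eta - \epsilon}^{\eta} \int_{\eta + \epsilon}^{\infty} e^{-V(a) - V(b)}\, da\, db.$$

Relabeling variables (so that $a$ is now the outer variable), this becomes:

$$\int_{\eta}^{\infty} \int_{\eta}^{\eta + \epsilon} e^{-V(a) - V(b)}\, db\, da - \int_{\eta + \epsilon}^{\infty} \int_{\eta - \epsilon}^{\eta} e^{-V(a) - V(b)}\, db\, da =$$

$$\int_{\eta}^{\infty} \int_{\eta}^{\eta + \epsilon} \left[ e^{-V(a) - V(b)} - e^{-V(a + \epsilon) - V(b - \epsilon)} \right] db\, da.$$

Since $a - b + \epsilon \ge 0$ throughout the region of integration, we claim that
$V(a + \epsilon) + V(b - \epsilon) - V(a) - V(b) \ge 0$. To see this, we can write by convexity:

$$V(a) \le \frac{a - b + \epsilon}{a - b + 2\epsilon} V(a + \epsilon) + \frac{\epsilon}{a - b + 2\epsilon} V(b - \epsilon),$$

$$V(b) \le \frac{a - b + \epsilon}{a - b + 2\epsilon} V(b - \epsilon) + \frac{\epsilon}{a - b + 2\epsilon} V(a + \epsilon).$$

Summing these two lines gives the claim. From this, we conclude that

$$e^{-V(a) - V(b)} - e^{-V(a + \epsilon) - V(b - \epsilon)} \ge 0$$

throughout the region of integration, and the lemma follows. □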

Since we will use it again in Chapter |S1 we state the simple fact we used at the end of 
the proof as a lemma. 

Lemma 4.1.5 If $a - b + \epsilon \ge 0$, and $V : \mathbb{R} \to \mathbb{R}$ is convex, then
$V(a + \epsilon) + V(b - \epsilon) \ge V(a) + V(b)$.
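A quick numerical sanity check of this inequality is sketched below. It is not part of the argument; it simply samples points satisfying the hypothesis, with $\epsilon \ge 0$ as in the proof above, for a few convex functions chosen here as assumptions ($t^2$, $|t|$, $\cosh t$). The name `shift_gain` is hypothetical.

```python
import math
import random

def shift_gain(V, a, b, eps):
    """V(a + eps) + V(b - eps) - V(a) - V(b), which the lemma asserts is nonnegative."""
    return V(a + eps) + V(b - eps) - V(a) - V(b)

convex_examples = [lambda t: t * t, abs, math.cosh]

random.seed(0)
worst = math.inf
for _ in range(10000):
    a, b = random.uniform(-3, 3), random.uniform(-3, 3)
    eps = random.uniform(0, 2)
    if a - b + eps < 0:
        continue  # hypothesis of the lemma not satisfied, skip this sample
    worst = min(worst, min(shift_gain(V, a, b, eps) for V in convex_examples))

example = shift_gain(lambda t: t * t, 1.0, 0.0, 0.5)
```

For $V(t) = t^2$ the gain equals $2\epsilon(a - b + \epsilon)$, which makes the role of the hypothesis $a - b + \epsilon \ge 0$ visible.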
4.1.3 Bounding $FE^{\bar\Phi}$ in terms of $FE^{\Phi}$

Lemma 4.1.6 For each $C_1 > 2$, there exists a $C_2$ such that for any edge $e$ and measure $\mu$
on $\Omega$, we have

$$FE^\Phi_e(\mu) - C_2 \le FE^{\bar\Phi}_e(\mu) \le C_1 FE^\Phi_e(\mu) + C_2.$$

Proof The first half of the inequality follows immediately from Lemma 4.1.4. Next, since $F$ is
a continuously differentiable, increasing one-to-one function, one easily checks that if $Y$
is any other real-valued random variable, then the relative entropy $\mathcal{H}(Y, X)$ is equal to
$\mathcal{H}(F(Y), F(X))$. (Here, when $X$ and $Y$ are real-valued random variables, we write $\mathcal{H}(Y, X)$
to mean $\mathcal{H}(\mu_Y, \mu_X)$, where the measures $\mu_Y$ and $\mu_X$ are the laws of $Y$ and $X$ on $\mathbb{R}$.) Define $X$
and $Z$ as in the previous section and recall by definition of free energy that if
$Y = \phi(y) - \phi(x)$, then $FE^\Phi_e(\mu) = FE^V(\mu_Y)$, where $FE^V(\mu_Y) + \log Z = \mathcal{H}(Y, X)$; and
$FE^{\bar\Phi}_e(\mu) = FE^{\bar V}(\mu_Y)$, where $FE^{\bar V}(\mu_Y) + \log \bar Z = \mathcal{H}(Y, F^{-1}G^{-1}F(X))$. We conclude from
Lemma 4.1.3 that

$$FE^{\bar V}(\mu_Y) + \log \bar Z = \mathcal{H}(Y, F^{-1}G^{-1}F(X)) = \mathcal{H}(F(Y), G^{-1}F(X)) \le$$

$$C_1 \mathcal{H}(F(Y), F(X)) + C_2 = C_1 \mathcal{H}(Y, X) + C_2 = C_1 \left( FE^V(\mu_Y) + \log Z \right) + C_2.$$

Since $\log Z$ and $\log \bar Z$ depend only on $V$, they can be absorbed into the constant $C_2$. □



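Lemma 4.1.7 For any $C_1 > 2d + 1$, there is a constant $C_2$ such that for any finite connected
$\Lambda \subset \mathbb{Z}^d$ and any measure $\mu$ on $\Omega$, we have

$$FE^\Phi_\Lambda(\mu) - C_2|\Lambda| \le FE^{\bar\Phi}_\Lambda(\mu) \le C_1 FE^\Phi_\Lambda(\mu) + C_2|\Lambda|.$$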

Proof The first half of the inequality follows immediately from Lemma 4.1.4. Next, we claim
that $\sum_{e \subset \Lambda} FE^\Phi_e(\mu) \le 2d\, FE^\Phi_\Lambda(\mu)$. We can prove this by invoking Lemma 2.3.1 if we assume
that $a$ used to define $FE^\Phi$ in the setup to that lemma is equal to zero; for simplicity, we will
assume this to be the case (the reader may check that without the assumption, the result
remains true with a different value of $C_2$). Given Lemma 2.3.1, it is enough to show that
there is a sequence $e_1, e_2, \ldots, e_{|\Lambda|-1}$ of edges that forms a spanning tree of $\Lambda$ and satisfies

$$\sum_{i=1}^{|\Lambda|-1} FE^\Phi_{e_i}(\mu) \ge \frac{1}{2d} \sum_{e \subset \Lambda} FE^\Phi_e(\mu).$$

The latter point is a simple graph theoretic fact that is easily verified with a greedy algorithm:
choose $e_1$ to be an edge $e$ for which $FE^\Phi_e(\mu)$ is maximal. Choose each subsequent $e_i$ to
be an edge $e$ for which $FE^\Phi_e(\mu)$ is maximal among those edges that do not, together with
$e_1, \ldots, e_{i-1}$, form cycles. The number of edges which form cycles at the $i$th step is at most the
number of edges that are incident to vertices of $e_1, \ldots, e_{i-1}$; this number is bounded above by
$2d(i - 1)$. Hence $FE^\Phi_{e_i}(\mu) \ge FE^\Phi_{e'}(\mu)$, where $e'$ is the $(2d(i - 1) + 1)$th edge when
the edges are listed in rank order of free energy values. Since this gives
$\sum_{i=1}^{|\Lambda|-1} 2d\, FE^\Phi_{e_i}(\mu) \ge \sum_{e \subset \Lambda} FE^\Phi_e(\mu)$, this fact follows.
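The greedy selection just described is the usual maximum-weight spanning tree procedure. The sketch below runs it on a small box in $\mathbb{Z}^2$ and checks the resulting inequality numerically. The edge weights are random nonnegative numbers standing in for the edge free energies (an assumption made purely for illustration), and the helper names are hypothetical.

```python
import itertools
import random

def grid_edges(n, d):
    """Nearest-neighbor edges of the box {0, ..., n-1}^d in Z^d."""
    edges = []
    for v in itertools.product(range(n), repeat=d):
        for i in range(d):
            if v[i] + 1 < n:
                edges.append((v, v[:i] + (v[i] + 1,) + v[i + 1:]))
    return edges

def greedy_spanning_tree(edges, weight):
    """Repeatedly take the heaviest edge that closes no cycle (tracked with union-find)."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    tree = []
    for e in sorted(edges, key=weight, reverse=True):
        ra, rb = find(e[0]), find(e[1])
        if ra != rb:  # endpoints lie in different components, so no cycle is formed
            parent[ra] = rb
            tree.append(e)
    return tree

random.seed(1)
d, n = 2, 5
edges = grid_edges(n, d)
fe = {e: random.expovariate(1.0) for e in edges}  # stand-in for the edge free energies
tree = greedy_spanning_tree(edges, fe.get)
tree_sum = sum(fe[e] for e in tree)
total_sum = sum(fe.values())
```

On this example the tree has $|\Lambda| - 1 = 24$ edges and carries at least a $1/(2d)$ fraction of the total weight, as the counting argument predicts.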
Using this fact in the last step, and taking $C_1'$ and $C_2'$ to be the constants from Lemma
4.1.6, we now see that

$$FE^{\bar\Phi}_\Lambda(\mu) - FE^\Phi_\Lambda(\mu) = \mu(H^{\bar\Phi}_\Lambda) - \mu(H^\Phi_\Lambda) = \sum_{e \subset \Lambda} \mu(\bar V(e) - V(e)) = \sum_{e \subset \Lambda} \left[ FE^{\bar\Phi}_e(\mu) - FE^\Phi_e(\mu) \right] \le$$

$$\sum_{e \subset \Lambda} \left( (C_1' - 1) FE^\Phi_e(\mu) + C_2' \right) \le 2d(C_1' - 1) FE^\Phi_\Lambda(\mu) + 2d|C_2'||\Lambda|.$$

We conclude the proof by adding $FE^\Phi_\Lambda(\mu)$ to both sides of the above equation sequence and
setting $C_1 = 2d(C_1' - 1) + 1$ and $C_2 = 2d|C_2'|$, recalling that $C_1'$ can be chosen arbitrarily
close to 2. □

Similarly, we have: 

Lemma 4.1.8 There exist $C_1$ and $C_2$ such that

$$SFE^\Phi(\mu) - C_2 \le SFE^{\bar\Phi}(\mu) \le C_1 SFE^\Phi(\mu) + C_2$$

for all $\mu$.

Proof This follows immediately from Lemma 4.1.7; simply take limits, using the definition
of $SFE$. □

Corollary 4.1.9 There exist $C_1$ and $C_2$ such that

$$\sigma^\Phi(u) - C_2 \le \sigma^{\bar\Phi}(u) \le C_1 \sigma^\Phi(u) + C_2$$

for all $u$.

4.1.4 Using $\bar\Phi$ to bound box measure free energies

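Lemma 4.1.10 There exist constants $C_1$ and $C_2$ such that whenever $\Lambda$ is a connected subset
of $\mathbb{Z}^d$ and $H^{\bar\Phi}_\Lambda(\phi) = \epsilon$, there exists a box measure $\tilde\mu$ on $\mathbb{R}^{|\Lambda|}$ centered at $\phi$ whose gradient
measure $\mu$ satisfies $FE^\Phi_\Lambda(\mu) \le C_1 \epsilon + C_2|\Lambda|$.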

Proof Fix $\eta > 0$ and let $(-a_\eta, a_\eta)$ be the largest interval for which $\zeta \in (-a_\eta, a_\eta)$ implies
$V(\zeta) \le V(\eta) + 1$. Then we claim that for some $C_1 > 0$, independent of $\eta$ and $V$, we
have $-\log|a_\eta - \eta| \le C_1 A(\eta) = C_1(\bar V(\eta) - V(\eta))$. For $\eta > 0$, recall that
$A(\eta) = -\log 4 - \log \int_\eta^\infty e^{-V(\zeta)}\, d\zeta$. Now, by convexity and the fact that $V$ is positive, for any
$\zeta > a_\eta$, we have $V(\zeta) \ge \frac{1}{a_\eta - \eta}(\zeta - a_\eta)$. This implies that

$$\int_{a_\eta}^{\infty} e^{-V(\zeta)}\, d\zeta \le \int_{a_\eta}^{\infty} e^{-(\zeta - a_\eta)/(a_\eta - \eta)}\, d\zeta = a_\eta - \eta.$$

Furthermore, since $V$ is positive, $\int_\eta^{a_\eta} e^{-V(\zeta)}\, d\zeta \le a_\eta - \eta$. It follows that

$$e^{-A(\eta)} = 4 \int_\eta^\infty e^{-V(\zeta)}\, d\zeta \le 8(a_\eta - \eta).$$

Hence,

$$A(\eta) \ge -\log 8 - \log(a_\eta - \eta).$$

Now, for each v G A, choose e(u) to be the minimum of ie'^M"^" for all ?/ adjacent to 
f . By the construction of a^, the definition of e(v), and the bound in the previous paragraph, 
for all ip in the box {^\</>(v) — e(v) < ip < 4>{y) + e(v),v G A}, and any edge e, we have 
Hf(4>) < Hfijp). Summing over all edges, we have Hf(ip) < Hf((p) + 2d|A|. Now, let // 
be the box measure that corresponds to fixing if>(vo) = 4>{vq) for a reference vertex vq and 
choosing the other ip{v) independently, uniformly in (<j>(v) — e(v),</>(v) + e(v)). Since the 
widths of the intervals are 2e(v), this energy bound and Lemma \2. 2. 21 together imply 

FE*Qi)<H*(4>)+2d\A\- Yl MHv))- 

Now, for each v, we have — log2(e(w)) < swpi v _ v n =1 y eA A(<p(v') — 4>{v)). How does 
the sum of these supremum values compare to the sum over all edges? We know that 
A(t]) + log 2 > 0, for any 77. Thus, 

FE*M - Ht{$) - 2d|A| < 
£ sup A(<f>(v') - 00)) +log2 < 

£ A(<f>(v')-<f>(v))+log2< 

v£A,v^vo \v— v'\=l,v'(:A 

2 Yl [^(0K)-0O 2 )] + 2|A|log2 = 

e=(v\,V2)<ZA 

2[Hl{<j>) -Ht{4>)\ +2\K\\og2. 
The lemma thus holds with $C_1 = 2$ and $C_2 = 2 \log 2 + 2d$. □

Applying similar analysis to the case $\phi = \phi_u$ and $\Lambda = \mathbb{Z}^d$ yields the following analog of
Lemma 4.1.1:

Lemma 4.1.11 There exist constants $C_1$ and $C_2$ such that for all $u$ there exists a box
measure $\tilde\mu$ centered at $\phi_u$ whose gradient measure $\mu$ satisfies $SFE^\Phi(\mu) \le C_1 \bar V(u) + C_2$. In
particular, $\sigma^\Phi(u) \le C_1 \bar V(u) + C_2$.

Lemmas 4.1.2 and 4.1.9 imply the other direction (that $\bar V(u) - C_1 \le \sigma(u)$). Hence, we
have the following:

Lemma 4.1.12 There exist $C_1$ and $C_2$ depending only on $V$ such that

$$\bar V(u) - C_2 \le \sigma(u) \le C_1 \bar V(u) + C_2.$$
4.2 Torus approximations



In this section, we allow $\Phi$ to be an SAP or a perturbed SAP and allow either $E = \mathbb{R}$ or
$E = \mathbb{Z}$. One way to compute and approximate $\sigma$ is using the finite torus $T$ given by $\mathbb{Z}^d$
modulo $\mathcal{L}$. A function $g$ from the directed edges of $T$ to $\mathbb{R}$ is called a local gradient function
if the sum of $g$ along any directed null-homotopic cycle in $T$ is zero. If $g$ were extended
periodically to $\mathbb{Z}^d$, then it would be the discrete derivative of a height function $\phi_g$ (defined
up to additive constant); we define the slope (or homology class) of $g$, written $S(g)$, to be the
asymptotic slope of the corresponding function $\phi_g$. Observe that $\psi_g(x) = \phi_g(x) - (S(g), x)$
is $\mathcal{L}$-periodic.

Write $H^\Phi(g) = \sum V(\phi_g(y) - \phi_g(x))$ where the sum ranges over a set of edges in $\mathbb{Z}^d$
corresponding to the edges of $T$. Write $\chi(u) = \inf_{S(g)=u} H^\Phi(g)$ and similarly
$\bar\chi(u) = \inf_{S(g)=u} H^{\bar\Phi}(g)$. From Lemma 4.1.10 we can deduce the following:

Lemma 4.2.1 There exist constants $C_1$ and $C_2$ such that $\sigma(u) \le C_1 \bar\chi(u) + C_2$.

In particular, $\sigma(u)$ is finite whenever $\bar\chi(u)$ is finite (which is true whenever $\chi(u)$ is finite).
It is easy to see that the converse holds. Hence, we have the following:

Lemma 4.2.2 If $E = \mathbb{R}$ and $\Phi$ is an ISAP, then the set $U_\Phi$ is the interior of the set of
slopes $u$ for which $\chi(u)$ is finite.

We can easily generalize the above result to the case $E = \mathbb{Z}$. Let $\Phi$ be the nearest neighbor
potential with edge potentials given by $V_{x,y}$. In this case, we define $\bar V_{x,y} : \mathbb{R} \to \mathbb{R}$ to be the
largest convex function which agrees with $V_{x,y}$ on the integers. This is a linear interpolation
of $V_{x,y}$ in between integers at which $V_{x,y}$ is finite; it is equal to $\infty$ outside of the interval
spanned by the integers at which $V_{x,y}$ is finite.

Now, given any (real) $g$ on $T$ for which $H^{\bar\Phi}(g)$ is defined, we can extend this to a real-valued
$\phi_g$. Now, let $\mu$ be the measure on $(\Omega, \mathcal{F})$ that returns $\lfloor \phi_g + \epsilon \rfloor$ where $\epsilon$ is chosen uniformly in
$[0, 1]$. It is easy to see that the restriction of this measure to $(\Omega, \mathcal{F}^\tau)$ is an $\mathcal{L}$-invariant measure
of slope $u$, and that $\mu(\Phi) = H^{\bar\Phi}(g)/I$ where $I$ is the index of $\mathcal{L}$ in $\mathbb{Z}^d$. In the discrete setting,
the specific relative entropy of a singleton measure is zero and the specific entropy of any
non-singleton measure is strictly greater than zero. We may conclude that $SFE(\mu) \le H^{\bar\Phi}(g)/I$
and $\sigma(u) \le \bar\chi(u)$ whenever $\bar\chi(u)$ is finite. Also, if $\sigma(u)$ is finite, then there exists a slope-$u$
measure $\mu$ for which $SFE(\mu)$ is finite; let $g$ be the local gradient function on $T$ for which
$\phi_g(y) - \phi_g(x) = \mu(\phi(y) - \phi(x))$. It is easily seen that $H^{\bar\Phi}(g)$ must be finite in this case.
Thus $\sigma(u)$ is infinite whenever $\bar\chi(u)$ is infinite. The reader may verify the generalization to
all simply attractive potentials.
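The rounding construction above rests on a simple identity: if $\epsilon$ is uniform on $[0, 1)$, then $\lfloor a + \epsilon \rfloor - \lfloor b + \epsilon \rfloor$ takes only the two integer values adjacent to $a - b$, with probabilities that reproduce the linear interpolation of the integer potential. The sketch below checks this for one pair of heights. The potential $V(k) = k^2$ and the function names are assumptions made for illustration only.

```python
import math

def convex_extension(V, eta):
    """Linear interpolation of an integer potential V between floor(eta) and floor(eta) + 1."""
    lo = math.floor(eta)
    t = eta - lo
    if t == 0:
        return V(lo)
    return (1 - t) * V(lo) + t * V(lo + 1)

def expected_rounded_energy(V, a, b):
    """E[V(floor(a + e) - floor(b + e))] for e uniform on [0, 1), computed piecewise."""
    # The integrand is piecewise constant in e; it can only jump where a + e or b + e
    # crosses an integer, so we integrate exactly over the pieces between those points.
    cuts = sorted({0.0, 1.0, (-a) % 1.0, (-b) % 1.0})
    total = 0.0
    for left, right in zip(cuts, cuts[1:]):
        mid = (left + right) / 2
        total += (right - left) * V(math.floor(a + mid) - math.floor(b + mid))
    return total

V = lambda k: k * k  # hypothetical convex potential on the integers
a, b = 2.3, 0.6
lhs = expected_rounded_energy(V, a, b)
rhs = convex_extension(V, a - b)
```

Both quantities equal $0.3 \cdot V(1) + 0.7 \cdot V(2) = 3.1$ here, which is why the expected energy of the randomly rounded surface is governed by the interpolated potential.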

Lemma 4.2.3 If $E = \mathbb{R}$ and $\Phi$ is an SAP, then the set $U_\Phi$ is the interior of the set of
slopes $u$ for which $\chi(u)$ is finite. If $E = \mathbb{Z}$, then $U_\Phi$ is the interior of the set of slopes $u$ for
which $\bar\chi(u)$ is finite.






When $E = \mathbb{R}$ and $\Phi$ is an ISAP, then $\bar\chi(u)$ is finite precisely when $\chi(u)$ is finite; but if
$\Phi$ is not an ISAP, then we have not defined $\bar\chi(u)$. However, if $E = \mathbb{Z}$, then our definition of
$\bar\chi(u)$ applies for all SAPs.

The reader may also check the following: 

Lemma 4.2.4 If $E = \mathbb{Z}$ and $\Phi$ is a Lipschitz simply attractive potential, then the set $U_\Phi$ is the
interior of the convex hull of the set $S$ of integer slopes $u$ for which $\chi(u)$ is finite.

The following is also trivial to verify: 

Corollary 4.2.5 If all of the convex nearest neighbor potentials $V_{x,y} : \mathbb{R} \to \mathbb{R}$ are everywhere
finite, then $\sigma$ is everywhere finite and $U_\Phi = \mathbb{R}^d$.

Finally, we would like to define homology-class-restricted Gibbs measures on a torus,
using a method similar to the one presented in [10]. Let $x_0, x_1, \ldots$ be representatives
of a fundamental domain of $\mathbb{Z}^d$ modulo $\mathcal{L}$; these vertices are in one-to-one correspondence
with the elements of $T$. Every local gradient function $g$ is determined by its homology class
and the values of the function $\phi_g$ (with additive constant chosen so that $\phi_g(x_0) = 0$) at
$x_1, x_2, \ldots, x_{j-1}$. Now, we can define a gradient measure $\mu_u$ on local gradient functions $g$
of homology class $u$ by

$$Z_T^{-1} e^{-H^\Phi(g)} \prod_{i=1}^{j-1} d\phi_g(x_i),$$

where $Z_T$ is the normalizing constant that makes the above a probability measure, and the
connection between $g$ and $\phi_g$ is determined by the value $u$.

Now, choose $k$ so that $k\mathbb{Z}^d \subset \mathcal{L}$, and define $\mu_u^n$ to be the measure produced as above on
the torus $T_n = [0, kn - 1]^d$ that one gets by replacing $\mathcal{L}$ by $nk\mathbb{Z}^d$. (If $E = \mathbb{Z}$, then when
defining $\mu_u^n$, we replace $u$ with $\frac{1}{kn} \lfloor kn u \rfloor$ in order to ensure that the slope $u$ homology class of
finite-energy local gradient functions is non-empty.)

Lemma 4.2.6 If $u \in U_\Phi$, then some subsequence of the measures $\mu_u^n$ converges in the topology
of local convergence to an $\mathcal{L}$-invariant gradient Gibbs measure $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$ with finite
specific free energy, which is less than or equal to $\liminf_{n \to \infty} -|T_n|^{-1} \log Z_{T_n}$.

Proof The proof is very similar to the proof of Lemma 2.4.2: namely, one observes
that there is an upper bound (independent of $n$) on the free energy of $\mu_u^n$ restricted to a set
$\Lambda$, then uses Lemma 2.1.5 and Lemma 2.1.7 to prove convergence for the $\mu_u^n$'s restricted to a
set $\Lambda \subset \mathbb{Z}^d$ (which can be treated as a subset of any sufficiently large torus), and then uses
the standard diagonalization argument to get convergence for all $\Lambda$. □

Statements similar to the above are proved for Ginzburg-Landau models in |3U] and for 
domino tiling models in [H| and [THj . 
4.3 Distance functions and interpolations



In this section, we give another perspective on the set $U_\Phi$ and discuss the problem of
interpolating functions defined on subsets of $\mathbb{Z}^d$ to finite-energy functions defined on all of
$\mathbb{Z}^d$. The basic ideas are simple and very standard (see [2] for more exposition and many
additional references; look for the keywords cycle decomposition and Dijkstra's algorithm).

If $x$ and $y$ are neighboring vertices (as always, $x$ preceding $y$ lexicographically) and $V_{x,y}$
is the pair potential connecting the two, then define $d(x, y) = \sup\{\eta \in E : V_{x,y}(\eta) < \infty\}$ and
$d(y, x) = -\inf\{\eta \in E : V_{x,y}(\eta) < \infty\}$. By definition, $d(x, y)$ is an upper bound on the value that
$\phi(y) - \phi(x)$ can take for any $\phi$ with $\Phi_{\{x,y\}}(\phi) < \infty$. For simplicity, we will assume here that
the interval on which $V_{x,y}$ is finite is closed (i.e., $V_{x,y}$ is lower semicontinuous), so the upper
bound is always achievable (this assumption does not affect the Gibbs kernels, since changing
a convex $V_{x,y}$ to make it lower semicontinuous, by altering its values at the endpoints, only
alters the potential on a set of Lebesgue measure zero). If $x$ and $y$ are not adjacent vertices
but are connected by a path $P = (x = p_0, p_1, p_2, \ldots, p_k = y)$ of vertices, we define $d_P(x, y)$ to
be $\sum_{j=0}^{k-1} d(p_j, p_{j+1})$ and define $D(x, y)$ to be the minimal value of $d_P(x, y)$ as $P$ ranges over
all paths connecting $x$ and $y$. We assume that there are no negative cycles, i.e., no paths
$P$ from a vertex $x$ to itself with $d_P(x, x) < 0$. (Otherwise, all height functions on regions
containing $P$ would have infinite energy.)

A classical theorem (see [2]) is the following:

Lemma 4.3.1 If we fix the values of a function $\phi'$ (real if $E = \mathbb{R}$, discrete if $E = \mathbb{Z}$) at all
vertices in $\Lambda \subset \mathbb{Z}^d$, then there exists a finite-energy height function $\phi$ defined on all of $\mathbb{Z}^d$
extending $\phi'$ (i.e., satisfying $\phi(x) = \phi'(x)$ for $x \in \Lambda$) if and only if $D(x, y) \ge \phi'(y) - \phi'(x)$
for each $x, y \in \Lambda$.

Proof The proof is straightforward: simply observe that $\phi(y) = \inf_{x \in \Lambda} \phi'(x) + D(x, y)$ is
such an extension function; in fact, this is the maximal such function. □
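The maximal extension in the proof is a shortest-path computation, and it can be sketched on a finite graph as follows. This is an illustration under assumptions, not the construction on all of $\mathbb{Z}^d$: the graph is a short path whose height differences are constrained to $[-1, 1]$, the distances are computed with the Floyd-Warshall recursion, and the function names are hypothetical.

```python
import math

def all_pairs_distance(vertices, d):
    """Floyd-Warshall on directed costs d[(x, y)]; raises if a negative cycle exists."""
    D = {(x, y): (0.0 if x == y else d.get((x, y), math.inf))
         for x in vertices for y in vertices}
    for k in vertices:
        for x in vertices:
            for y in vertices:
                if D[x, k] + D[k, y] < D[x, y]:
                    D[x, y] = D[x, k] + D[k, y]
    if any(D[x, x] < 0 for x in vertices):
        raise ValueError("negative cycle: no finite-energy height function exists")
    return D

def maximal_extension(vertices, d, fixed):
    """Return phi(y) = min over fixed x of fixed[x] + D(x, y), or None if no extension exists."""
    D = all_pairs_distance(vertices, d)
    if any(fixed[y] - fixed[x] > D[x, y] for x in fixed for y in fixed):
        return None  # the condition D(x, y) >= phi'(y) - phi'(x) fails
    return {y: min(fixed[x] + D[x, y] for x in fixed) for y in vertices}

# Hypothetical example: a path 0-1-...-5 whose height differences must lie in [-1, 1].
vertices = list(range(6))
d = {}
for x in range(5):
    d[(x, x + 1)] = 1.0  # largest allowed value of phi(x+1) - phi(x)
    d[(x + 1, x)] = 1.0  # largest allowed value of phi(x) - phi(x+1)
phi = maximal_extension(vertices, d, {0: 0.0, 4: 2.0})
too_steep = maximal_extension(vertices, d, {0: 0.0, 2: 3.0})
```

With heights fixed to 0 at vertex 0 and 2 at vertex 4, the extension climbs as fast as allowed from vertex 0 and then drops to meet the second constraint; asking for a rise of 3 over two steps violates the criterion and no extension is returned.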

A similar argument gives the following: 

Lemma 4.3.2 There exists a finite energy local gradient function $g$ on a torus $T$ with slope
$u$ if and only if there exists no path $P = (p_0, p_1, \ldots, p_k = p_0)$ in $T$ (with no vertices repeated
except the first/last value) such that the lifting $\tilde P = (\tilde p_0, \tilde p_1, \ldots, \tilde p_k)$ of $P$ to the covering space
$\mathbb{Z}^d$ satisfies $(u, \tilde p_k - \tilde p_0) > D_{\tilde P}(\tilde p_0, \tilde p_k)$.

Proof Note that if $E = \mathbb{R}$, then the gradient function $g$ has slope $u$ on $T$ if and only if the
function $f(x) = g(x) - (u, x)$ has slope zero (and hence extends to a periodic function on $\mathbb{Z}^d$);
also, $g$ has finite energy with respect to $\Phi$ if and only if $f$ has finite energy with respect to
the nearest neighbor potential $\Phi'$ defined by $V'_{x,y}(\eta) = V_{x,y}((u, y - x) + \eta)$. Now, the paths
$P$ described above correspond precisely to negative cycles with respect to $d$ defined using the
potential $\Phi'$; it follows from Lemma 4.3.1 that they fail to exist if and only if there exists a
finite energy, zero-slope height function $f$ (and hence a finite energy local gradient function
$g$ of slope $u$).

If $E = \mathbb{Z}$, then the integer-valued gradient function $g$ has slope $u$ on $T$ if and only if
the integer-valued function $f(x) = g(x) - \lfloor (u, x) \rfloor$ has slope zero (and hence extends to
a periodic function on $\mathbb{Z}^d$); also, $g$ has finite energy with respect to $\Phi$ if and only if $f$
has finite energy with respect to the nearest neighbor potential $\Phi'$ defined by
$V'_{x,y}(\eta) = V_{x,y}(\lfloor (u, y) \rfloor - \lfloor (u, x) \rfloor + \eta)$. The argument proceeds as in the real case. □
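In the real case, the criterion of the lemma can be tested mechanically by tilting the edge bounds by $u$ and searching for a negative cycle. The sketch below does this with Bellman-Ford relaxation on the torus $(\mathbb{Z}/n\mathbb{Z})^d$. It assumes, for illustration only, that every edge in direction $e_i$ allows height differences in a fixed interval $[\text{lower}_i, \text{upper}_i]$; the function name is hypothetical.

```python
import itertools

def slope_is_feasible(u, n, upper, lower):
    """Bellman-Ford test for negative cycles on the torus (Z/nZ)^d after tilting by u.

    Assumes every edge x -> x + e_i allows height differences in [lower[i], upper[i]].
    """
    d = len(u)
    vertices = list(itertools.product(range(n), repeat=d))
    edges = []
    for v in vertices:
        for i in range(d):
            w = v[:i] + ((v[i] + 1) % n,) + v[i + 1:]
            edges.append((v, w, upper[i] - u[i]))  # tilted bound on f(w) - f(v)
            edges.append((w, v, u[i] - lower[i]))  # tilted bound on f(v) - f(w)
    dist = {v: 0.0 for v in vertices}  # zero everywhere plays the role of a virtual source
    for _ in range(len(vertices)):
        changed = False
        for a, b, cost in edges:
            if dist[a] + cost < dist[b] - 1e-12:
                dist[b] = dist[a] + cost
                changed = True
        if not changed:
            return True  # relaxation stabilized: no negative cycle
    return False  # still relaxing after |V| rounds: a negative cycle exists

bounds = ((1.0, 1.0), (-1.0, -1.0))
inside = slope_is_feasible((0.5, -0.9), 3, *bounds)
outside = slope_is_feasible((1.2, 0.0), 3, *bounds)
corner = slope_is_feasible((1.0, 1.0), 3, *bounds)
```

With all differences constrained to $[-1, 1]$, slopes in the closed cube $[-1, 1]^2$ pass the test and slopes outside it fail, consistent with the description of $U_\Phi$ as the interior of a polyhedron cut out by finitely many cycle conditions.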

Let $P_1, P_2, \ldots, P_k$ be the finitely many non-self-intersecting cycles in $T$. The above lemma
and Lemma 4.2.2 imply the following:

Lemma 4.3.3 The set $U_\Phi$ is the interior of the set of values $u$ for which
$(u, y_i - x_i) \le D_{P_i}(x_i, y_i)$ for $1 \le i \le k$, where here $x_i$ and $y_i$ are the starting and ending
points of the lifting of $P_i$. In particular, if $U_\Phi$ is not the entire space $\mathbb{R}^d$, then it is the
intersection of finitely many half spaces. If each $V_{x,y}$ is equal to $\infty$ outside of a finite interval,
then $U_\Phi$ is the interior of a convex polyhedron.

A finite energy function $\phi$ defined on all of $\mathbb{Z}^d$ is said to be upward taut if for some $x \in \mathbb{Z}^d$,
and some $C$, every finite energy $\phi' : \mathbb{Z}^d \to E$ which agrees with $\phi$ on all but finitely many
places satisfies $\phi'(x) \le C$; $\phi$ is downward taut if every such $\phi'$ satisfies $\phi'(x) \ge C$.

Lemma 4.3.4 A finite energy $\phi : \mathbb{Z}^d \to E$ is upward taut if and only if for some $P_i$ and some
$x$, the expression $k D_{P_i}(x_i, y_i) - \phi(x + k(y_i - x_i))$ is bounded below for all $k \ge 0$. Similarly,
$\phi$ is downward taut if and only if for some $P_i$, the expression $k D_{P_i}(y_i, x_i) - \phi(x + k(y_i - x_i))$
is bounded above for all $k \ge 0$.

Proof By Lemma 4.3.1, $\phi$ is upward taut if and only if for some $x$, there exists a sequence $y_j$
arbitrarily far away from $x$ for which $\phi(y_j) - \phi(x) \ge D(x, y_j) - C$. Let $Q_j$ be the corresponding
paths connecting $x$ and $y_j$. Taking a subsequential limit of these paths gives a path $Q$ from
$x$ to $\infty$ with $\phi(q_j) - \phi(x) \ge D(x, q_j) - C$ for every $q_j$ in $Q$. We refer to such a path as a
$C$-taut path. (A zero-taut path is called simply a taut path.) The path $Q$, projecting down to
the torus $T$, must completely traverse at least one of the paths $P_i$ infinitely often. Let $x'$ be
the first time in $Q$ a vertex equal to the initial vertex of $P_i$, modulo $T$, occurs. Let $z_1$ and
$z_2$ be the first and last points of the first time the path $P_i$ occurs. Then we can replace $Q$
with a new $C$-taut path by translating the path $P_i$ so that it begins at $x$, and
translating the path between $x'$ and $z_1$ so that it begins at $x + (y_i - x_i)$. Repeating this
process, we can arrange for $Q$ to begin with an arbitrarily long sequence of $P_i$'s. Again,
we find that the path $P$ consisting of an infinite sequence of $P_i$'s is also a $C$-taut path. A
similar argument applies to downward taut functions. □

Lemma 4.3.5 If $\Phi$ is simply attractive and $\mu$ is an $\mathcal{L}$-ergodic gradient Gibbs measure with
finite specific free energy and $S(\mu) = u \in U_\Phi$, then $\mu$-almost all $\phi$ are not taut.

Proof If $\phi$ is taut, then for some $P_i$ and $C$, the ergodic theorem implies that there
almost surely exist a positive fraction of vertices which are the beginnings of infinite $C$-taut
paths formed by concatenating $P_i$'s end to end. We claim that each of these paths must in
fact be 0-taut; suppose otherwise. Then along one of the infinite sequences $x + \mathbb{Z}(y_i - x_i)$,
there would have to be a place where, for some $\epsilon$, $x$ was the last value in the sequence to
be the beginning of an $\epsilon$-taut path. But only one such event can occur in each path; such
events cannot occur with positive $\mu$ probability because $\mu$ is invariant under translation by
$k(y_i - x_i)$, for $k \in \mathbb{Z}$.

A similar argument shows that in fact every $x' \in x + \mathcal{L}$ must belong to an infinite taut
path of this form; it follows that the slope $u$ of $\mu$ satisfies $(u, y_i - x_i) = D_{P_i}(x_i, y_i)$, and
hence it lies on the boundary of $U_\Phi$, by Lemma 4.3.3. □

Note that from the proof, we also have: 

Lemma 4.3.6 Let $E = \mathbb{Z}$ and $u \in \partial U_\Phi$, and let $\mu$ be an $\mathcal{L}$-invariant measure of finite specific
free energy and slope $u$. Then for some $x$ and some $P_i$, the path formed by concatenating
infinitely many copies of $P_i$ starting at any $x + y$, with $y \in \mathcal{L}$, is almost surely taut. In fact,
for each face $X$ of the polyhedron $\partial U_\Phi$, there exists at least one $P_i$ such that the preceding
statement is true for $P_i$ whenever $u \in X$.

Proof Let $P_i$ be a path which determines the face $X$ (as in Lemma 4.3.3). □

We can also conclude that if $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$ is any gradient measure supported on finite
energy functions $\phi$, then $\phi(x_2) - \phi(x_1)$ is $\mu$-almost surely constant whenever $x_1$ and $x_2$ lie
in a path $P$ of the type described above. It is easy to see that such a measure must have
infinite specific free energy if $E = \mathbb{R}$; but it will have finite specific free energy if $E = \mathbb{Z}$. We
can lift the path $P$ defined on $T$ to an infinite path in $\mathbb{Z}^d$ along which all height differences
are "frozen" by $\mu$. Whenever $\mu$ has slope $u$ which lies on the boundary of $U_\Phi$, we say that
$\mu$ is taut and that $u$ is a taut slope. It is not hard to check the following:

Lemma 4.3.7 Let $\Phi$ be a simply attractive potential. If $E = \mathbb{Z}$, then $\sigma$ extends continuously
to a real-valued function on the closure of $U_\Phi$. If $E = \mathbb{R}$, then $\sigma(u)$ tends to infinity as $u$
approaches the boundary of $U_\Phi$ and $\sigma(u) = \infty$ on the boundary itself.

We can approximate $D(x, y)$ as follows: write $D_\Phi(x) = \sup_{u \in U_\Phi} (u, x)$.

Lemma 4.3.8 There exists a constant $C$ such that $|D_\Phi(y - x) - D(x, y)| \le C$ for all $x$ and
$y$ in $\mathbb{Z}^d$.

Proof Using the above arguments of Lemma 4.3.2, for each $u$, we can construct a finite-energy
function which, restricted to any offset $b + \mathcal{L}$, is equal to a plane of slope $u$. It is
easy to see that for any $b \in \mathcal{L}$, there exists some $a$ in a fundamental domain of $\mathcal{L}$ for which
$D(a, b + a) = D_\Phi(b)$. The result follows by taking $C = 2 \sup |D(z_1, z_2)| + 2 \sup |D_\Phi(z_2 - z_1)|$
where $z_1$ and $z_2$ are members of that fundamental domain. □
4.4 Gradient phase existence



Lemma 4.4.1 If $u$ is either an exposed point or a convex combination of exposed points of
$\sigma$, then there exists a measure $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$ with $SFE(\mu) = \sigma(u)$. If $u$ is an exposed point
of $\sigma$, then $\mu$ can be taken to be ergodic.

Proof For $w \in \mathbb{R}^d$, let $\Psi^w$ be the nearest neighbor potential defined by

$$\Psi^w_{\{x,y\}}(\phi) = [\phi(x) - \phi(y)](w, x - y).$$

Define $\Phi^w = \Phi + \Psi^w$. It is not hard to verify (using the discrete fundamental theorem of
calculus) the equality of the Gibbs kernels:

$$\gamma^{\Phi}_\Lambda = \gamma^{\Phi^w}_\Lambda,$$

as well as the identity $SFE^{\Phi^w}(\mu) = SFE^{\Phi}(\mu) + (w, S(\mu))$, which in turn implies
$\sigma_{\Phi^w}(u) = \sigma_\Phi(u) + (w, u)$. Now, if $\sigma_\Phi$ has an exposed point at $u$, then we can replace $\Phi$ with a
$\Phi^w$ for which $\sigma_{\Phi^w}$ has a unique minimum at $u$. By Lemma 2.5.1 there exists a measure
$\mu$ with $SFE^{\Phi^w}(\mu) = \sigma_{\Phi^w}(u)$. Clearly, such a measure has finite slope, so (by definition of
$\sigma_{\Phi^w}$) it must have slope $u$ and specific free energy equal to $\sigma(u)$. Furthermore, Jensen's
inequality, convexity of $SFE$, and the ergodic decomposition theorem imply that the ergodic
components of $\mu$, with probability one, must have slope $u$. Now, suppose $u$ is not an exposed
point but $u$ is a convex combination of exposed points $u = \sum a_i u_i$. Let $\mu_{u_i}$ denote an ergodic
measure of slope $u_i$ with $SFE(\mu_{u_i}) = \sigma(u_i)$. Then Lemma 3.3.4 implies that $\mu = \sum a_i \mu_{u_i}$ is
a Gibbs measure of slope $u$ with $SFE(\mu) = \sigma(u)$. □
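The reason the tilt by $w$ does not change the Gibbs kernels is that the added energy telescopes to a boundary term. The one-dimensional sketch below makes this concrete: on a chain with fixed endpoint heights, the tilt energy takes the same value for every interior configuration. The chain, the value of $w$ and the name `tilt_energy` are assumptions made for this illustration.

```python
import random

def tilt_energy(phi, w):
    """Sum over chain edges {x, x+1} of [phi(x) - phi(x+1)] * (w * (x - (x+1)))."""
    return sum((phi[x] - phi[x + 1]) * (w * (x - (x + 1))) for x in range(len(phi) - 1))

random.seed(2)
w, left, right = 0.7, 0.0, 3.0
shifts = []
for _ in range(5):
    interior = [random.uniform(-5, 5) for _ in range(6)]  # arbitrary interior heights
    phi = [left] + interior + [right]                     # boundary heights held fixed
    shifts.append(tilt_energy(phi, w))
```

Every entry of `shifts` equals $w(\phi(\text{right end}) - \phi(\text{left end})) = 2.1$, so the Boltzmann weights of interior configurations are all multiplied by the same constant and the conditional law given the boundary values is unchanged.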

We say $V : E \to \mathbb{R}$ is super-linear if for all $c > 0$, there exists a $b > 0$ such that
$V(\eta) \ge c|\eta|$ whenever $|\eta| \ge b$. Say $\Phi$ is a super-linear simply attractive potential if $\Phi$ is
simply attractive and all of the nearest neighbor potentials of $\Phi$ grow super-linearly. The
following is a simple consequence of Lemma 2.3.8 and Lemma 2.4.2:

Lemma 4.4.2 If $\Phi$ is a super-linear simply attractive potential, then $\mu_i \to \mu$ (in the topology
of local convergence) and $SFE^\Phi(\mu_i) \le C$ for all $i$ together imply $S(\mu_i) \to S(\mu)$ and
$SFE^\Phi(\mu) \le C$.

4.5 A word on Lipschitz potentials 

Assume that $E = \mathbb{Z}$. Many naturally arising discrete random surfaces (height functions
for domino tilings, square ice, etc.) have the property that the nearest neighbor potential
functions are equal to infinity outside of a finite interval. A Lipschitz potential is a potential
$\Phi$ such that for each adjacent pair of vertices $x$ and $y$, the potential $\Phi_{\{x,y\}}(\phi)$ is equal to $\infty$
when $\phi(x) - \phi(y)$ lies outside of a bounded interval of $\mathbb{Z}$. The reader may easily check the
following:
Lemma 4.5.1 A gradient, nearest neighbor, $\mathcal{L}$-invariant potential $\Phi$ is Lipschitz if and only
if it is a perturbed LSAP.

By Lemma 4.3.3, $U_\Phi$ is the interior of a convex polyhedron and by Lemma 4.3.7, $\sigma$ is
a bounded, continuous function on the closure of $U_\Phi$. Discrete Lipschitz potentials (i.e.,
Lipschitz potentials in the case $E = \mathbb{Z}$) are convenient to work with for many reasons. First
of all, an $\mathcal{L}$-periodic Lipschitz potential $\Phi$ can be completely described with a finite amount
of data. And although it is generally not known how to compute $\sigma$ exactly, one can use
simulations (and the alternate definition of $\sigma$ given in Chapter [HJ) to approximate $\sigma$ to
arbitrary precision. Also, for discrete LSAPs, Propp and Wilson give an algorithm (called
"coupling from the past") for perfect sampling from $\gamma^\Phi_\Lambda(\cdot, \phi)$, where $\Lambda$ is a finite subset of
$\mathbb{Z}^d$ and $\phi$ is an arbitrary "boundary function" (which need only be defined on the boundary
of $\Lambda$) [TB]. Finally, given any discrete SAP $\Phi$ and $C > 0$, we can approximate $\Phi$ with a
Lipschitz potential, $\Phi^C$, defined by

$$\Phi^C_\Lambda(\phi) = \begin{cases} \Phi_\Lambda(\phi) & \Phi_\Lambda(\phi) \le C \\ \infty & \text{otherwise.} \end{cases}$$

It is not hard to show that as $C$ gets large, the variational distance between the measures
$\gamma^{\Phi^C}_\Lambda(\cdot, \phi)$ and $\gamma^\Phi_\Lambda(\cdot, \phi)$ decays exponentially.
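The exponential decay can be seen already in the simplest setting of a single edge. The sketch below compares the law proportional to $e^{-V(k)}$ on the integers with its truncation to $\{V \le C\}$. It assumes the solid-on-solid type potential $V(k) = |k|$ and a finite window of integers standing in for $\mathbb{Z}$; the function names are hypothetical.

```python
import math

def single_edge_law(V, support):
    """Probability mass function proportional to exp(-V(k)) on a finite integer window."""
    weights = {k: math.exp(-V(k)) for k in support if V(k) < math.inf}
    Z = sum(weights.values())
    return {k: wgt / Z for k, wgt in weights.items()}

def truncate(V, C):
    """Lipschitz approximation: keep V where V <= C and set it to infinity elsewhere."""
    return lambda k: V(k) if V(k) <= C else math.inf

def total_variation(p, q):
    """Total variation distance between two probability mass functions given as dicts."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

V = lambda k: abs(k)     # hypothetical edge potential
window = range(-60, 61)  # finite window standing in for Z
full = single_edge_law(V, window)
distances = [total_variation(full, single_edge_law(truncate(V, C), window))
             for C in (1, 2, 4, 8)]
```

In this example the distance equals the mass that the untruncated law places outside $\{|k| \le C\}$, namely $2e^{-(C+1)}/(1 + e^{-1})$, so each unit increase of $C$ reduces it by a factor of $e$.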



4.6 Examples: layered surfaces, non-intersecting lattice paths

Fix $d$, and let $\Phi$ be a simply attractive potential. We can use a Gibbs measure of $\Phi$ to choose
a single random surface $\phi : \mathbb{Z}^d \to \mathbb{Z}$. Now, we would like a way to choose a sequence of
"layered surfaces" $\phi_i$, defined for $i \in \mathbb{Z}$, in such a way that

1. For each $x \in \mathbb{Z}^d$ and $i \in \mathbb{Z}$, $\phi_i(x) < \phi_{i+1}(x)$.

2. Given $\phi_{i-1}$, $\phi_{i+1}$, and the values $\phi_i(x)$ for $x$ outside of a finite set $\Lambda$, the conditional
distribution on the values $\phi_i(x)$ for $x \in \Lambda$ can be described as follows: it is the measure
$\gamma^\Phi_\Lambda(\cdot, \phi_i)$ conditioned on $\phi_{i-1}(x) < \phi_i(x) < \phi_{i+1}(x)$ for all $x \in \Lambda$.

The natural way to do this is to replace $\Phi$ by a $(d+1)$-dimensional potential $\Phi'$. Write
$\phi'(x, i) = \phi_i(x)$, where $x \in \mathbb{Z}^d$ and $i \in \mathbb{Z}$. Then we write $\Phi'_{\{(x,i),(y,i)\}} = \Phi_{\{x,y\}}$ and

$$\Phi'_{\{(x,i),(x,i+1)\}}(\phi') = \begin{cases} \beta(\phi'(x, i+1) - \phi'(x, i)) & \phi'(x, i+1) > \phi'(x, i) \\ \infty & \text{otherwise.} \end{cases}$$

Intuitively, we would like to set $\beta = 0$. In this case, however, $\Phi'$ would not be simply
attractive. But since the kernels $\gamma^{\Phi'}$ are independent of $\beta$, we can get around this by fixing
$\beta$ to be an arbitrary positive constant.
We can think of a Gibbs measure of $\Phi'$ as a way of choosing a sequence of layered
surfaces. This construction also makes sense if we replace $E = \mathbb{Z}$ by $E = \mathbb{R}$. It is now
easy to check that $U_{\Phi'} = U_{\Phi} \times (1, \infty)$ when $E = \mathbb{Z}$, and $U_{\Phi'} = U_{\Phi} \times (0, \infty)$ when $E = \mathbb{R}$.
If $u = (u_1, u_2)$, with $u_1 \in \mathbb{R}^d$ and $u_2 \in \mathbb{R}$, then we can think of a random function $\phi'$ of
slope $u$ as a sequence of layered slope-$u_1$ functions $\phi_i$, spaced apart with density $1/u_2$. It
is a consequence of the results in Chapter |H] that, at least when $E = \mathbb{R}$, the minimal-SFE
ergodic slope-$u_1$, density-$1/u_2$ layered surface Gibbs measure is unique. In the special case
$d = 1$ and $E = \mathbb{Z}$, we can think of layered surfaces as non-intersecting paths in a lattice.
(The latter appear frequently in the theory of random matrices.)
Chapter 5

Analytical results for Sobolev spaces 



In order to prove our large deviations principles on surface shapes in later chapters, we will
need to construct a topology in which certain sets of bounded-average-energy surfaces are
compact. To this end, we will ultimately convert some of our questions about discretized
surfaces into analogous questions about continuous functions. The topologies on the continuous
function spaces will be generated by the Orlicz-Sobolev norms defined in this chapter,
which are generalizations of $L^p$ norms that arise when the function $|\cdot|^p$ is replaced by a more
general positive, convex, symmetric function $A$; it will generally be desirable to choose $A$ in
such a way that this topology is as strong as possible, since this will lead to the strongest
large deviations principles and concentration inequalities. Thus, for a given potential $\Phi$, we
would like to determine the strongest topologies in which the necessary compactness results
hold.

In Section 5.1 we will cite several known results (compact imbedding theorems and various
bounds) about Orlicz-Sobolev spaces on a bounded domain $D \subset \mathbb{R}^d$. We cite these
results without proof from pQ, ^H], and |7Tj. While much of this theory is classical, some of
the results we employ have been proved during the last few years, including the strongest
versions of the imbedding theorems and some of the bounds we need. A summary of these
recent results (many by Cianchi) can be found in [16].

The results in this chapter will assist us primarily when $\Phi$ is an ISAP (or a perturbed
ISAP). When $\Phi$ is an LSAP (or perturbed LSAP) the variational and large deviations principles
can be derived without the theorems of this chapter. If $V : \mathbb{R} \to [0, \infty]$ is a non-constant,
even, convex function, as always, we will denote by $\Phi_V$ the ISAP whose nearest neighbor
gradient potentials are given by $V$. In Section 5.2 we will describe how the Hamiltonians
$H^{\Phi_V}(\phi)$ and $H^{o,\Phi_V}(\phi)$ can be approximated via energy integrals of continuous interpolations
of $\phi$; we also define the "good approximations" of bounded, sufficiently regular domains $D$
(by subsets of $\frac{1}{n}\mathbb{Z}^d$) that we will use in the large deviations principle of Chapter 7. We extend
the compactness results of Section 5.1 to families of functions defined on approximating
subsets of $D$ (and not necessarily all of $D$).

In Section 5.3 we derive a technical result we need for both the proof of the variational
principle in Chapter El and the proof of the large deviations principle in Chapter 7. Finally,
in Section 5.4 we show how to approximate certain functions $f : D \to \mathbb{R}$ by functions that
are "mostly linear" and whose "energy" is not much larger than the energy of $f$; this result
will be useful in proving the lower bound on probabilities in the large deviations principle of
Chapter 7.

5.1 Orlicz-Sobolev spaces 

5.1.1 Orlicz-Sobolev space definitions 

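We define

\[ W^{j,p}(D) = \{ f \in L^p(D) : D^{\alpha} f \in L^p(D) \text{ for } 0 < |\alpha| \le j \}, \]

where $\alpha = (\alpha_1, \ldots, \alpha_d)$ is a multi-index, with $0 \le \alpha_i \in \mathbb{Z}$, and $D^{\alpha}$ is the distributional
derivative. (Here, we use the definition $|\alpha| = \sum_{i=1}^{d} \alpha_i$.) Define a norm by

\[ \|f\|_{j,p} = \Big( \sum_{0 \le |\alpha| \le j} \|D^{\alpha} f\|_p^p \Big)^{1/p} \]

for $1 \le p < \infty$ and $\|f\|_{j,\infty} = \sup_{0 \le |\alpha| \le j} \|D^{\alpha} f\|_{\infty}$. These are Sobolev spaces.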

We define a Young function to be a convex, even function $A : \mathbb{R} \to \mathbb{R}^+ \cup \{\infty\}$ for which
$A(0) = 0$, $A$ is finite on some open interval $(-a, a)$, and $A$ is not identically zero (which
implies that $A(\eta)$ grows at least linearly in $\eta$ for $\eta$ large). Observe that for an isotropic
simply attractive potential we may and will assume (adding a constant to $V$ if necessary
to make $V(0) = 0$) that $V$ is a Young function. When $A$ is a Young function, the Orlicz
space $L^A(D)$ is the space of functions $f : D \to \mathbb{R}$ for which the norm

\[ \|f\|_{A,D} = \inf\Big\{ k : \int_D A\Big(\frac{f(\eta)}{k}\Big)\, d\eta \le 1 \Big\} \]

is finite. We write $\|f\|_{A,D}$ simply as $\|f\|_A$, and $L^A(D)$ as $L^A$, when the choice of $D$ is clear
from context. The Orlicz-Sobolev spaces are the spaces

\[ W^{j,A}(D) = \{ f \in L^A(D) : D^{\alpha} f \in L^A(D) \text{ for } 0 < |\alpha| \le j \}. \]

Clearly, Sobolev spaces are Orlicz-Sobolev spaces defined using the Young functions $A(\eta) = |\eta|^p$
for some $p$. From here on, we will assume $j = 1$ and deal only with the spaces $W^{1,A}$.
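
The Orlicz norm above is defined as an infimum, so it may help to see how it can be evaluated in practice. The following is a minimal sketch, not part of the source, that approximates the norm of a function sampled on cells of equal volume by bisection on $k$. The function name `luxemburg_norm`, the sampling scheme, and the tolerance are assumptions made for illustration; the sketch relies on the energy being nonincreasing in $k$, which holds for any Young function.

```python
def luxemburg_norm(values, cell_volume, A, tol=1e-12):
    """Approximate inf{k > 0 : sum_i A(values[i] / k) * cell_volume <= 1}."""
    if all(v == 0 for v in values):
        return 0.0

    def energy(k):
        return sum(A(v / k) for v in values) * cell_volume

    lo, hi = 0.0, 1.0
    while energy(hi) > 1.0:      # grow hi until the constraint is satisfied
        hi *= 2.0
    while hi - lo > tol * hi:    # bisect on the monotone energy
        mid = 0.5 * (lo + hi)
        if energy(mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi

# For A(eta) = |eta|^p the result should agree with the usual L^p norm.
samples = [1.0, 2.0, 3.0, 4.0]
norm_cubic = luxemburg_norm(samples, 0.25, lambda x: abs(x) ** 3)
```

With $A(\eta) = |\eta|^3$ and the four samples above (each cell of volume $1/4$), the value should match $(\tfrac{1}{4}(1 + 8 + 27 + 64))^{1/3} = 25^{1/3}$ up to the bisection tolerance.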

If $v = (v_1, \ldots, v_d) \in \mathbb{R}^d$, then we write $A(v) = \sum_{i=1}^{d} A(v_i)$. We write $|v|$ for the Euclidean
norm of $v$. Since $\sup_{i} |v_i| \le |v| \le d \sup_{i} |v_i|$ and $A$ is convex (with $A(0) = 0$), it is clear
that $A(v/d) \le A(|v|) \le A(dv)$. Now, there are two natural ways to define $\|\nabla f\|_{A,D}$; one is

\[ \|\nabla f\|_{A,D} = \inf\Big\{ k : \int_D A\Big(\frac{|\nabla f(\eta)|}{k}\Big)\, d\eta \le 1 \Big\}. \]

The second is the same but without the absolute value sign on $\nabla f$, i.e.,

\[ \|\nabla f\|_{A,D} = \inf\Big\{ k : \int_D A\Big(\frac{\nabla f(\eta)}{k}\Big)\, d\eta \le 1 \Big\}. \]

The above discussion implies that the two norms thus defined are equivalent: they differ
from one another by at most a factor of $d$. Unless otherwise specified, we will always use the
second definition. Some of the theorems we cite (compact imbeddings, etc.) are proved in
papers which use the first definition; however, all these cited results obviously remain true
when the norm is replaced by an equivalent one.
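
The comparison between $A$ applied coordinatewise and $A$ applied to the Euclidean norm, which underlies the equivalence of the two norms, can be spot-checked numerically. The sketch below is illustrative only: the three Young functions, the random test vectors, and the helper names `A_vec` and `sandwich_holds` are assumptions, not objects from the source.

```python
import math
import random

def A_vec(A, v):
    """Coordinatewise extension A(v) = sum_i A(v_i)."""
    return sum(A(x) for x in v)

def sandwich_holds(A, v):
    """Check A(v/d) <= A(|v|) <= A(d v) up to a small floating-point slack."""
    d = len(v)
    norm = math.sqrt(sum(x * x for x in v))
    middle = A(norm)
    slack = 1e-9 * (1.0 + abs(middle))
    lower = A_vec(A, [x / d for x in v])
    upper = A_vec(A, [d * x for x in v])
    return lower <= middle + slack and middle <= upper + slack

young_functions = [abs, lambda x: x * x, lambda x: math.cosh(x) - 1.0]
rng = random.Random(0)
checks = [
    sandwich_holds(A, [rng.uniform(-3.0, 3.0) for _ in range(d)])
    for A in young_functions
    for d in range(2, 6)
    for _ in range(50)
]
```

A passing check is of course not a proof; it only illustrates the inequality on a few examples.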
We now cite the following (Section 2.3 of [16]):

Lemma 5.1.1 The spaces $W^{1,A}$, equipped with the norm $\|f\|_{W^{1,A}} = \|f\|_A + \|\nabla f\|_A$, are
Banach spaces.

We define $L^{1,A}$ to be the set of weakly differentiable functions $f : D \to \mathbb{R}$ with $|\nabla f| \in L^A(D)$.
Note that $W^{1,A}(D) = L^A(D) \cap L^{1,A}(D)$. We will later see that $W^{1,A}(D) = L^{1,A}(D)$ on all
domains $D$ of interest to us.

5.1.2 Regular domains: the domain class $G(\frac{d-1}{d})$

Denote by $f_D$ the mean $|D|^{-1} \int_D f(\eta)\, d\eta$, where $|D|$ is the Lebesgue volume of $D$. (Unless
otherwise specified, all integrals over $D$ are with respect to Lebesgue measure.) The following
kinds of assertions are starting points for the Orlicz-Sobolev theory:

1. For some constant $C$, $\|f - f_D\|_{A^*} \le C \|\nabla f\|_A$ (for appropriately chosen Young functions
$A$ and $A^*$).

2. For some constant $C$, $\|f\|_{A^*} \le C(\|f\|_A + \|\nabla f\|_A)$ (where $A$ is a Young function and
$A^*$ is some appropriately chosen Young function that grows more rapidly than $A$). In
other words, the imbedding of $W^{1,A}(D)$ into $L^{A^*}(D)$ exists and is continuous.

Both kinds of results depend on regularity properties of the domain $D$. Even without
specifying $A$ and $A^*$, we can imagine what might go wrong if we did not have regularity
conditions. Suppose $D$ consists of the interior of $D_b \cup D_1 \cup D_2$ where $D_1$ and $D_2$ are
very large closed cubes (not necessarily of equal volume) and $D_b$ is a long but very skinny
rectangular tube (a "bottleneck") connecting them. Then take $f$ to be a function that is
equal to $0$ on $D_1$ and $M$ on $D_2$, and varies linearly between $0$ and $M$ within $D_b$.
By making $M$ sufficiently large and making $D_b$ sufficiently long and skinny, we can make
$\|f - f_D\|_{A^*}$ arbitrarily large even while making $\|\nabla f\|_A$ arbitrarily small.
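
The bottleneck effect can be made quantitative in a simplified setting. The sketch below is an illustration under stated assumptions rather than anything from the source: the dimension is 2, the two cubes are unit squares, both $A$ and $A^*$ are taken to be $\eta^2$ (so the norms are ordinary $L^2$ norms), and the integrals are evaluated in closed form. The names `dumbbell_norms` and `ratio` are hypothetical.

```python
import math

def dumbbell_norms(length, width, M=1.0):
    """L^2 norms of f - f_D and of grad f for two unit squares joined by a tube."""
    # f = 0 on the first square, M on the second, and linear along the tube,
    # so by symmetry the mean f_D equals M / 2.
    squares = 2.0 * (M / 2.0) ** 2            # integral of (f - M/2)^2 over both squares
    tube = width * M ** 2 * length / 12.0     # the same integral over the tube
    oscillation = math.sqrt(squares + tube)
    gradient = math.sqrt((M / length) ** 2 * length * width)
    return oscillation, gradient

def ratio(length, width):
    """Ratio ||f - f_D|| / ||grad f||; unbounded as the tube gets long and skinny."""
    oscillation, gradient = dumbbell_norms(length, width)
    return oscillation / gradient
```

For example, `ratio(1.0, 0.1)` is about 2.25 while `ratio(10.0, 0.01)` is about 22.5, so no single constant $C$ can serve all such domains.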

We can imagine that if $D$ had infinitely many (increasingly skinny) bottlenecks of this
form, it might be possible, by choosing $f$ equal to zero on one side of some bottleneck
and some value $M$ on the other side, to produce functions $f$ on $D$ for which $\|f - f_D\|_{A^*} / \|\nabla f\|_A$ is
arbitrarily large, contradicting the first assertion. Similar problems arise with the second
assertion. Both assertions require a bound on the volume separation of the bottleneck (i.e.,
the amount of mass on the smaller side of the bottleneck) in terms of its width.

To be precise, define $P(E; D)$ to be the perimeter of $E$ relative to $D$, i.e., the total
variation over $D$ of the gradient of the characteristic function of $E$. (When $E$ has a smooth
boundary, then $P(E; D)$ is simply the $(d-1)$-dimensional Hausdorff measure of $\partial E \cap D$. See
Section 6.1.1 of [7U] for more details.) Let $G(z)$ be the set of bounded domains $\{D \subset \mathbb{R}^d\}$
for which there exists a constant $C$ such that

\[ [\min\{|E|, |D - E|\}]^{z} \le C\, P(E; D) \]

for all Lebesgue measurable subsets $E$ of $D$, where $|\cdot|$ denotes Lebesgue measure. In
particular, when $z = \frac{d-1}{d}$ this is a very natural restriction, which we write

\[ \min\{|E|, |D - E|\} \le C\, P(E; D)^{\frac{d}{d-1}}. \]

This is natural because we can interpret $C\, P(E; D)^{\frac{d}{d-1}}$ as a constant times the volume
contained in a sphere with surface area $P(E; D)$. The infimum of the $C$ for which this inequality
holds is called the isoperimetric constant of $D$. The space $G(\frac{d-1}{d})$ includes many families
of domains for which Sobolev-type results were proved classically, including bounded sets
satisfying the cone property (see Corollary 3.2.1/3 of [7U] and Section 2.4 of ^H]), the strong
local Lipschitz property, and the uniform $C^m$-regularity property for $m \ge 1$ (see 4.3 to 4.7 of
ED-).

There is a range of weaker Orlicz-Sobolev theorems that apply when weaker regularity
conditions are placed on $D$ (see, e.g., Remark 3.12 of [IE]); it may be possible to use these
more general results to prove weaker large deviations principles for random surfaces on
mesh approximations of these more general domains. (See Chapter E3.) However, we will
limit our attention to the domains in $G(\frac{d-1}{d})$. Random surfaces on what we will call "good
approximations" of these domains are especially convenient because they have the same
large deviations properties as random surfaces on the standard approximations of the unit
cube $[0, 1]^d$.

5.1.3 Comparing Young functions 

When $A$ and $B$ are Young functions, we say that $B$ dominates $A$ near infinity if there exist
positive constants $c_1$ and $c_2$ such that $A(\eta) \le B(c_1 \eta)$ for $\eta \ge c_2$. If for every $c_1 > 0$, there is
a $c_2$ for which this holds, we say $A$ increases essentially more slowly than $B$. We call $A$ and
$B$ equivalent near infinity if each dominates the other near infinity. The following fact is a
motivation for this definition. (See Remark 3.3 of [TB*].)

Lemma 5.1.2 If $D$ has finite volume and the Young functions $A$ and $B$ are equivalent near
infinity, then the Luxemburg norms $\|f\|_A$ and $\|f\|_B$ are equivalent norms.
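
As a small worked illustration of the comparison relations for Young functions (an example chosen here, not taken from the source), consider the two power functions below.

```latex
\begin{align*}
A(\eta) &= \eta^2, \qquad B(\eta) = |\eta|^3, \\
A(\eta) \le B(c_1 \eta) &\iff \eta^2 \le c_1^3 \eta^3 \iff \eta \ge c_1^{-3} \qquad (\eta > 0), \\
B(\eta) \le A(c_1 \eta) &\iff \eta^3 \le c_1^2 \eta^2 \iff \eta \le c_1^{2} \qquad (\eta > 0).
\end{align*}
```

The second line shows that for every $c_1 > 0$ the choice $c_2 = c_1^{-3}$ works, so $A$ increases essentially more slowly than $B$. The third line shows that no pair $(c_1, c_2)$ makes $A$ dominate $B$ near infinity, so the two functions are not equivalent near infinity.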

5.1.4 Sobolev conjugates

Following Section 3.2 of ^B], given $d \ge 2$ and a Young function $A$, we define an increasing
function $H : [0, \infty) \to [0, \infty)$ by

\[ H(r) = \Big( \int_0^r \Big( \frac{t}{A(t)} \Big)^{\frac{1}{d-1}} dt \Big)^{\frac{d-1}{d}} \]

and $A_d : [0, \infty) \to [0, \infty)$ by

\[ A_d = A \circ H^{-1}, \]

where $H^{-1}$ is the left-continuous inverse of $H$. We extend $A_d$ to $\mathbb{R}$ by $A_d(\eta) = A_d(-\eta)$.
Note: in order for $H$ to be finite and well-defined, we have to assume that for $c > 0$, we have

\[ \int_0^c \Big( \frac{t}{A(t)} \Big)^{\frac{1}{d-1}} dt < \infty. \]

If this is the case, we say that $A$ has a conjugate $A_d$. If this is not the case, we can replace
$A$ by any equivalent Young function for which the integral does converge and define $H$
and $A_d$ using that Young function instead; we call this $A_d$ an equivalency conjugate for $A$.
(Such an equivalency conjugate always exists by Remark 3.3 of [IE].) Also, note that if
$C = \big( \int_0^{\infty} ( \frac{t}{A(t)} )^{\frac{1}{d-1}} dt \big)^{\frac{d-1}{d}} = \infty$, then $A_d(\eta)$ is everywhere finite. Otherwise, it is infinite for $|\eta| > C$.
In particular (as the reader may easily check), $A_d(\eta)$ is everywhere finite if $A(\eta) \le \eta^d$ and
infinite outside an interval if $A(\eta) \ge \eta^{d+\epsilon}$ and $\epsilon > 0$.
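
A worked example may help fix ideas. The computation below is a sketch that assumes $H$ has the form stated in its first comment, which is how the definition in the cited reference is understood here; the displayed formula in this copy of the text is not fully legible, so that form should be treated as an assumption. Under it, a pure power $A(t) = t^p$ with $1 \le p < d$ produces the classical Sobolev exponent.

```latex
% Assumed form of H (an assumption of this sketch):
%   H(r) = ( \int_0^r (t / A(t))^{1/(d-1)} dt )^{(d-1)/d},   A_d = A \circ H^{-1}.
\begin{align*}
A(t) &= t^p, \qquad 1 \le p < d, \\
\int_0^r \Big(\frac{t}{A(t)}\Big)^{\frac{1}{d-1}} dt
  &= \int_0^r t^{-\frac{p-1}{d-1}}\, dt
   = \frac{d-1}{d-p}\, r^{\frac{d-p}{d-1}}, \\
H(r) &= \Big(\frac{d-1}{d-p}\Big)^{\frac{d-1}{d}} r^{\frac{d-p}{d}}, \\
H^{-1}(s) &= \Big(\frac{d-p}{d-1}\Big)^{\frac{d-1}{d-p}} s^{\frac{d}{d-p}}, \\
A_d(s) &= \Big(\frac{d-p}{d-1}\Big)^{\frac{p(d-1)}{d-p}} s^{\frac{dp}{d-p}}.
\end{align*}
```

So, under this assumption, $A_d$ is a constant multiple of $s^{dp/(d-p)}$, and $L^{A_d}$ is the usual target space $L^{p^*}$ with $p^* = dp/(d-p)$. Since $H(r) \to \infty$ as $r \to \infty$ when $p < d$, this $A_d$ is everywhere finite, in agreement with the remark above.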

We cite a concrete example from Example 3.17 of If $A(\eta)$ is equivalent near infinity
to $\eta^p (\log \eta)^q$, then $A_d(\eta)$ is equivalent near infinity to



If either $p > d$, or $p = d$ and $q > d - 1$, then $A_d$ is equal to $\infty$ outside of a finite interval.
We use $A^*$ to denote a sub-conjugate of $A$, i.e., any Young function that increases essentially
more slowly than $A_d$.

5.1.5 Imbeddings 

We cite the following (Theorem 3.9, Remark 3.10, and Theorem 3.13 of [16]):

Theorem 5.1.3 Let $A$ be any Young function with conjugate $A_d$ and $D \in G(\frac{d-1}{d})$. Then
the following are true:

1. There exists a constant $K$, depending only on $A$, the volume of $D$, and the isoperimetric
constant $C$ of $D$, such that for any $f \in W^{1,A}$,

\[ \|f - f_D\|_{A_d} \le K \|\nabla f\|_A. \]

2. The imbedding

\[ W^{1,A}(D) \to L^{A_d}(D) \]

is well-defined and continuous. Given $A$, if $A$ has a conjugate (not merely an equivalency
conjugate) $A_d$, then $K$ depends only on $C$ (and the same $K$ holds if $C$ is replaced
by any $C_0 < C$).

3. In each of the two previous items, the Orlicz space $L^{A_d}$ is the smallest Orlicz space for
which the result is true.

4. If $A^*$ is any Young function increasing essentially more slowly near infinity than $A_d$,
then the imbedding

\[ W^{1,A}(D) \to L^{A^*}(D) \]

is compact.

(The fact that the $K$ in the first statement that holds for $C$ also holds for any $C_0 < C$ is
not stated explicitly in [16], but it appears to be understood in the context. In any case, it is
not hard to see that if a counterexample to the statement exists, in the form of a function $f$
on a domain $D$, for some $C$ and $K$, then a counterexample also exists for $K$ and any larger
$C$. The counterexample can be obtained by removing a zero-volume subset of $D$ to produce
a new set $D'$ with the appropriate larger isoperimetric constant, and then replacing $f$ with
its restriction to $D'$.)

The following corollary is useful, as it implies that whenever we can prove $f \in L^{1,A}(D)$,
we can apply Theorem 5.1.3 to produce a bound on the $L^{A_d}$ norm.

Corollary 5.1.4 If $D \in G(\frac{d-1}{d})$, then the spaces $L^{1,A}(D)$ and $W^{1,A}(D)$ are equivalent.

Proof Suppose that $\|\nabla f\|_A$ is finite but $\|f\|_A$ is infinite. Then we can take a truncation
$f_M(\eta)$ equal to $f(\eta)$ when $|f(\eta)| \le M$, $M$ for $f(\eta) > M$, and $-M$ for $f(\eta) < -M$. Write
$g_M = f_M + c_M$ where the constants $c_M$ are chosen in such a way that $g_M$ has mean zero. As
$M$ tends to $\infty$, $\|\nabla g_M\|_A$ tends to $\|\nabla f\|_A$ and $\|g_M\|_A$ tends to infinity. To see the latter fact,
observe that if $f$ has finite mean $f_D$, then the constants $c_M$ converge to $-f_D$ and $\|g_M\|_A$ converges
to $\|f - f_D\|_A$, which is infinite because $\|f\|_A$ is. If $f$ does not have finite mean, then $\|g_M\|_1$ tends to infinity; hence $\|g_M\|_A$ tends to
infinity for any Young function $A$. These facts imply that the ratio $\|g_M\|_A / \|\nabla g_M\|_A$ grows
arbitrarily large, contradicting Theorem 5.1.3. ∎

Now we can make a statement closer to the form we will actually need it to be in for our 
proofs: 

Corollary 5.1.5 Let $A$ be any Young function which has a conjugate $A_d$ and $D \in G(\frac{d-1}{d})$,
and let $A^*$ be any function that $A_d$ dominates near infinity, and $C$ a positive constant.
Then there exists a constant $K$, depending only on $A$, $A^*$, and $C$, such that whenever the
isoperimetric constant of $D$ is less than $C$ and $f \in W^{1,A}$,

\[ \|f - f_D\|_{A^*} \le K \int_D A(\nabla f(\eta))\, d\eta. \]

Moreover, if $A^*$ increases essentially more slowly than $A_d$, then the mapping from $L^{1,A}(D)$
to $L^{A^*}$ that sends $f$ to $f - f_D$ is compact.
Proof This follows from Theorem 5.1.3 and a couple of simple observations. The first is
that the convexity of $A$, the fact that $A(0) = 0$, and Jensen's inequality imply that

f A(Vf( V ))d V <\\f\\ A . 

The second is that the image of the unit ball in $L^{1,A}$ is a subset of the image of the unit ball
of $W^{1,A}$; since the latter image is precompact, the former is also. ∎

5.2 Connection to the discrete settings 
5.2.1 Simplex interpolations and discrete norms 

Given $w = (w_1, w_2, \ldots, w_d) \in \mathbb{R}^d$, write $\lfloor w \rfloor = (\lfloor w_1 \rfloor, \lfloor w_2 \rfloor, \ldots, \lfloor w_d \rfloor)$, where for any real $\eta$,
$\lfloor \eta \rfloor$ is the integer part of $\eta$. Also, let $s(w) \in S_d$ be the permutation (uniquely defined for
almost all $w$) that gives the rank ordering of the components of $w - \lfloor w \rfloor$. For each vertex
$v \in \mathbb{Z}^d$ and $s \in S_d$, we denote by $C(v, s)$ the closure of the simplex of points $w$ with $\lfloor w \rfloor = v$
and $s(w) = s$.
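
The simplices $C(v, s)$ and the piecewise-linear interpolation built on them can be sketched in a few lines. This is an illustration only: the convention that $s$ lists coordinates in order of decreasing fractional part is an assumption of the sketch (the text only requires some fixed rank ordering), and the names `simplex_of`, `simplex_vertices`, and `interpolate` are hypothetical.

```python
import math

def simplex_of(w):
    """Return (v, s): the integer part of w and a rank ordering of its fractional parts."""
    v = tuple(math.floor(x) for x in w)
    frac = [x - vi for x, vi in zip(w, v)]
    # Assumed convention: coordinates sorted by decreasing fractional part.
    s = tuple(sorted(range(len(w)), key=lambda i: -frac[i]))
    return v, s

def simplex_vertices(v, s):
    """Vertices of C(v, s): start at v and step one unit along s[0], s[1], ... in turn."""
    vertices = [tuple(v)]
    current = list(v)
    for i in s:
        current[i] += 1
        vertices.append(tuple(current))
    return vertices

def interpolate(phi, w):
    """Linear interpolation of lattice values phi on the simplex containing w."""
    v, s = simplex_of(w)
    t = [w[i] - v[i] for i in s]            # fractional parts, in decreasing order
    # Barycentric weights of w with respect to the path vertices.
    weights = [1.0 - t[0]]
    weights += [t[k] - t[k + 1] for k in range(len(t) - 1)]
    weights.append(t[-1])
    return sum(wt * phi(x) for wt, x in zip(weights, simplex_vertices(v, s)))
```

For instance, `interpolate(lambda x: 2 * x[0] - 3 * x[1] + 1, (0.3, 1.7))` reproduces the linear function itself (value $-3.5$), since a linear function agrees with its own interpolation.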

We say that a domain in $\mathbb{R}^d$ is a simplex domain if it is the interior of a finite union of
simplices of the form $C(v, s)$. We say that a subset $\Lambda \subset \mathbb{Z}^d$ is a simplex boundary set if
it is the union of the corner sets of the simplices in a simplex domain (denote this simplex
domain by $\tilde{\Lambda}$) and if every adjacent pair of vertices in $\Lambda$ forms an edge of at least one of
these simplices. In a sense we describe precisely in the next subsection, we will often fix a
domain $D \subset \mathbb{R}^d$ and choose a sequence of simplex boundary sets $D_n$ so that the domains
$\hat{D}_n = \frac{1}{n} \tilde{D}_n$ are increasingly close "approximations" of $D$. We will assume that the volume
of the $\hat{D}_n$, written $|\hat{D}_n|$, is bounded between positive constants $C_1$ and $C_2$, independently of
$n$. To summarize:

1. $D$ is a domain in $\mathbb{R}^d$.

2. $D_n \subset \mathbb{Z}^d$ is an "approximation" (to be precisely defined later) of $nD$.

3. $\tilde{D}_n$ is a simplex domain in $\mathbb{R}^d$ derived from $D_n$; it is an "approximation" of $nD$.

4. The normalized domain $\hat{D}_n = \frac{1}{n} \tilde{D}_n$ is an "approximation" of $D$.

Given a function $\phi : D_n \to E$, denote by $\tilde{\phi} : \tilde{D}_n \to \mathbb{R}$ the unique function that extends
to the closure of $\tilde{D}_n$ in such a way that it agrees with $\phi$ on $D_n$ and is linear on the closure
of each simplex of $\tilde{D}_n$; define a rescaled function $\hat{\phi} : \hat{D}_n \to \mathbb{R}$ by $\hat{\phi}(\eta) = \frac{1}{n} \tilde{\phi}(n\eta)$. Again, to
summarize:

1. Begin with $\phi : D_n \to E$.

2. Then $\tilde{\phi} : \tilde{D}_n \to \mathbb{R}$ is the linear interpolation of $\phi$ to $\tilde{D}_n$.

3. $\hat{\phi} : \hat{D}_n \to \mathbb{R}$, given by $\hat{\phi}(\eta) = \frac{1}{n} \tilde{\phi}(n\eta)$, is a rescaling of $\tilde{\phi}$.
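Write

\[ \mathcal{E}_{n,A}(\phi) = \frac{1}{|D_n|} \sum_{y \in D_n} A\Big(\frac{\phi(y)}{n}\Big) \]

and similarly

\[ \mathcal{E}_{n,A}(\nabla \phi) = \frac{1}{|D_n|} \sum_{\{x,y\} \subset D_n,\ |x-y|=1} A(\phi(y) - \phi(x)) = \frac{1}{|D_n|} H^o_{D_n}(\phi). \]

Here $H^o$ is the Hamiltonian of the ISAP $\Phi_A$. Note the normalization by the size of $D_n$ built
into these definitions. Roughly speaking, the first gives the average value of $A(\phi(\cdot)/n)$ on $D_n$;
the second gives the "energy per site" of $\phi$. The following simple lemma gives a connection
between continuous and discrete energy integrals. For any continuous $f : D \to \mathbb{R}$, write
$\mathcal{E}_D(\nabla f) = \int_D A(\nabla f(\eta))\, d\eta$ and $\mathcal{E}_D(f) = \int_D A(f(\eta))\, d\eta$.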
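
The two discrete energies described in the preceding paragraph (the average of $A(\phi(\cdot)/n)$ over sites, and the nearest-neighbor "energy per site") are easy to compute directly. The sketch below assumes the height function is stored as a dictionary from lattice points (tuples of integers) to real values; that representation and the name `discrete_energies` are assumptions made for illustration.

```python
def discrete_energies(phi, A, n):
    """Return (average of A(phi/n) over sites, nearest-neighbor energy per site)."""
    sites = list(phi)
    size = len(sites)
    value_energy = sum(A(phi[y] / n) for y in sites) / size

    gradient_energy = 0.0
    for x in sites:
        for i in range(len(x)):
            # Visit each unordered nearest-neighbor pair once, via its "upper" endpoint.
            y = x[:i] + (x[i] + 1,) + x[i + 1:]
            if y in phi:
                gradient_energy += A(phi[y] - phi[x])
    return value_energy, gradient_energy / size

# A 2 x 2 block of sites with A(eta) = eta^2 and n = 2.
block = {(0, 0): 0.0, (1, 0): 1.0, (0, 1): 2.0, (1, 1): 3.0}
value_energy, gradient_energy = discrete_energies(block, lambda x: x * x, 2)
```

On this block the four edges contribute $1 + 4 + 4 + 1 = 10$, so the energy per site is $10/4$, and the site average of $A(\phi/2)$ is $(0 + 0.25 + 1 + 2.25)/4$.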

Lemma 5.2.1 Assume (as above) that the volume of the normalized domain $\hat{D}_n = \frac{1}{n} \tilde{D}_n$ is bounded between two positive
constants $c_1$ and $c_2$. Then there exist positive constants $C_1$ and $C_2$ (independent of $n$) such
that for any $\phi : D_n \to \mathbb{R}$, with rescaled interpolation $\hat{\phi}$, we have

\[ C_1 \mathcal{E}_{n,A}(\nabla \phi) \le \mathcal{E}_D(\nabla \hat{\phi}) \le C_2 \mathcal{E}_{n,A}(\nabla \phi). \]

There also exist positive constants $C_1$ and $C_2$ (independent of $n$) for which

\[ \mathcal{E}_{n,A}(C_1 \phi) \le \mathcal{E}_D(\hat{\phi}) \le \mathcal{E}_{n,A}(C_2 \phi). \]

Proof For the first statement, let $C(v, s)$ be a simplex of $\tilde{D}_n$ and let $v^0, v^1, v^2, \ldots, v^d$ be the
vertices in the path starting at $v^0 = v$ and stepping at the $i$th step one unit in the $s(i)$th coordinate direction, so
that $v^d = v + (1, 1, \ldots, 1)$. The vertices of this path are the vertices of $C(v, s)$. Observe that

\[ \int_{\frac{1}{n} C(v,s)} A(\nabla \hat{\phi}(\eta))\, d\eta = \frac{1}{n^d} \int_{C(v,s)} A(\nabla \tilde{\phi}(\eta))\, d\eta = \frac{1}{n^d\, d!} \sum_{i=1}^{d} A(\phi(v^i) - \phi(v^{i-1})). \]

Since $\hat{D}_n$ has volume bounded between two positive constants, we may deduce that the ratio
of $|D_n|$ and $n^d$ is also bounded between two positive constants. The result now follows from
the fact that every edge of $D_n$ is an edge of at least one simplex and at most $d!$ simplices of
$\tilde{D}_n$, and each edge of a simplex of $\tilde{D}_n$ is also an edge of the graph $D_n$.

For the second statement, by arguments similar to those above, it is enough to prove the
result for the case that $\tilde{D}_n$ consists of a single simplex $C(v, s)$. That is, it is enough to show
that when $D_n$ is the set of vertices of a single simplex, then for some $C_1$ and $C_2$

\[ \sum_{x \in D_n} A(C_1 \phi(x)) \le \int_{C(v,s)} A(\tilde{\phi}(\eta))\, d\eta \le \sum_{x \in D_n} A(C_2 \phi(x)), \]

and this is a simple exercise. ∎
Given the sets $D_n$ as above, we can now define discrete norms

\[ \|\phi\|_{A,n} = \inf\{ k : \mathcal{E}_{n,A}(\phi / k) \le 1 \} \]

and

\[ \|\nabla \phi\|_{A,n} = \inf\{ k : \mathcal{E}_{n,A}(\nabla \phi / k) \le 1 \}. \]

The following is an immediate consequence of Lemma 5.2.1.

Corollary 5.2.2 Let $D_n$ be a sequence of simplex boundary sets and assume that the volume
of $\hat{D}_n$ lies between two positive constants. Then there exist positive constants $C_1$ and $C_2$,
independent of $n$, for which the following hold for any $\phi : D_n \to \mathbb{R}$:

\[ C_1 \|\phi\|_{A,n} \le \|\hat{\phi}\|_A \le C_2 \|\phi\|_{A,n}, \]

\[ C_1 \|\nabla \phi\|_{A,n} \le \|\nabla \hat{\phi}\|_A \le C_2 \|\nabla \phi\|_{A,n}. \]

In particular, this result implies that Theorem 5.1.3 and Corollary 5.1.4 remain true for
spaces of functions defined on $D_n$ if we replace the continuous norms with discrete ones (and
the relevant constants in these bounds are independent of $n$).

5.2.2 Good approximations of domains $D \in G(\frac{d-1}{d})$

Let $\partial D$ denote the boundary of $D$. Given a sequence of subsets $D_n$ of $\mathbb{Z}^d$, we say that the
sets $\frac{1}{n} D_n$ are good approximations of $D$ if:

1. $D_n \subset \mathbb{Z}^d \cap nD$ for all $n$.

2. $\lim_{n \to \infty} \sup_{x \in [\mathbb{Z}^d \cap nD] \setminus D_n} \inf_{y \in \partial D} |\frac{x}{n} - y| = 0$.

3. For each $x \in D_n$ and $y \in \partial D$, $|x - ny|_{\infty} \ge 1$, where $|\cdot|_{\infty}$ is the supremum norm of a
vector.

4. Each $D_n$ is a simplex boundary set.

5. For some $C$, each $\hat{D}_n$ has isoperimetric constant less than or equal to $C$.

The first item states that the sets $\frac{1}{n} D_n$ approximate $D$ from within. The second one states that
the approximation gets progressively better as $n$ gets large; it implies that every compact
subset of $D$ is contained in $\hat{D}_n$ for all sufficiently large $n$. The third is a technical condition
that requires that we exclude from $D_n$ points that are too close to the boundary of $nD$.
This condition ensures, for example, that every edge in $D_n$ will be completely contained
in $nD$. The fourth condition makes it possible to use Section 5.2.1 to approximate energies
of functions on $D_n$ with continuously defined energies of continuous functions on $\hat{D}_n$. The
fifth condition prevents us from using a sequence of approximations in which the bottlenecks
become increasingly severe with $n$; in particular, it implies that each $\hat{D}_n$ is connected.
Figure 5.1: A possible simplex domain approximation for a domain $D$



Finally, for the purposes of our large deviations principles, we will need a topology,
similar to the $L^{A^*}$ topology for an appropriate Young function $A^*$, on a space which includes
the functions from $\hat{D}_n$ to $\mathbb{R}$ for all $n$. Suppose that $\frac{1}{n} D_n$ are good approximations of $D$. Let

\[ \overline{L}^{A^*}(D) = L^{A^*}(D) \cup \bigcup_{n=1}^{\infty} L^{A^*}(\hat{D}_n). \]

For notational convenience, we also treat $D$ itself as one of the sets $\hat{D}_n$. Assume $A$ has Sobolev conjugate $A_d$ and
that $A^*$ increases essentially more slowly than $A_d$ near infinity. For any $\phi_1$ and $\phi_2$ in $\overline{L}^{A^*}(D)$
(defined on $\hat{D}_i$ and $\hat{D}_j$ respectively), write $\delta(\phi_1, \phi_2) = \|\phi_1 - \phi_2\|_{A^*, \hat{D}_i \cap \hat{D}_j} + |\hat{D}_i - \hat{D}_j|$, where
$|\hat{D}_i - \hat{D}_j|$ denotes the Lebesgue measure of the symmetric difference of these two sets. We
treat $\overline{L}^{A^*}(D)$ as being endowed with the topology generated by this metric. If $\phi$ is defined
on $\hat{D}_n$, write $\mathcal{E}(\nabla \phi) = \int_{\hat{D}_n} A(\nabla \phi(\eta))\, d\eta$.

Lemma 5.2.3 The set $X_C$ of zero-mean functions $\phi \in \overline{L}^{A^*}(D)$ with $\mathcal{E}(\nabla \phi) \le C$ is compact
in $\overline{L}^{A^*}(D)$.

Proof Since $\overline{L}^{A^*}(D)$ is a metric space, it is sufficient to prove that $X_C$ is sequentially
compact. Let $\phi_i$ be any sequence of functions in $X_C$; if an infinite number of the $\phi_i$ are
defined on the same $\hat{D}_n$, then the existence of a convergent subsequence follows from Theorem
5.1.3. Otherwise, we may assume that the $\phi_i$ are defined on $\hat{D}_{n_i}$ where $n_i$ is increasing in $i$. By a
diagonalization argument, we can choose a subsequence $m(i)$ of the integers, indexed by $i$,
such that the restrictions of the $\phi_{m(i)}$ to $\hat{D}_n$ (which are defined when $i$ is large enough so that $\phi_{m(i)}$ is defined on a
superset of $\hat{D}_n$) converge, for each $n$, to a limit in $L^{A^*}(\hat{D}_n)$ (again, by Theorem 5.1.3). Thus,
there is a unique function $\phi$ such that $\phi_{m(i)}$ converges to $\phi$ on every compact subset of $D$.
We need only to check that $\phi \in L^{A^*}(D)$ and that the $\phi_{m(i)}$ also converge to $\phi$ with respect
to the metric $\delta$.

Since there is a uniform upper bound on the isoperimetric constants of the $\hat{D}_n$, by Theorem
5.1.3 there is a uniform upper bound $k$ on $\|\phi_i\|_{A_d}$, and hence on $\|\phi_i\|_{A^*}$ as well.

If we had $\int_D A^*(\frac{\phi(\eta)}{k})\, d\eta > 1$, then for some $i$, we would have $\int A^*(\frac{\phi_{m(i)}(\eta)}{k})\, d\eta > 1$, a contradiction.
Hence $\|\phi\|_{A^*,D} \le k$.

Next, we know that for every $\epsilon$ and $n$ there exists an $N$ such that for all $m \ge N$, we have
$\|\phi_m - \phi\|_{A^*, \hat{D}_n} < \epsilon$. Let $\psi_n = 1_{D \setminus \hat{D}_n} (\phi_m - \phi)$; it will be clear that $\delta(\phi_n, \phi) \to 0$ if we can
show that $\|\psi_n\|_{A^*,D}$ tends to zero for every fixed $\epsilon$. This follows easily from the fact that
$\|\psi_n\|_{A_d}$ is uniformly bounded (independently of $n$), that the volume of the sets on which
the $\psi_n$ are supported tends to zero in $n$, and that $A^*$ increases essentially more slowly near
infinity than $A_d$. ∎

The following corollary is the only compactness result we will actually need in our proof
of the large deviations principle (in the space $\overline{L}^{A^*}(D)$) for random surfaces defined using
isotropic convex nearest neighbor potentials.

Corollary 5.2.4 Let $Y_C$ be the set of zero-mean functions of the form $\hat{\phi}$, where $\phi : D_n \to \mathbb{R}$
satisfies

\[ H^o_{D_n}(\phi) \le C |D_n|. \]

Here $H^o$ is the Hamiltonian for the isotropic convex nearest neighbor potential determined
by $A$. If $A$ has a conjugate $A_d$ and $A^*$ increases essentially more slowly than $A_d$, then $Y_C$ is
precompact in $\overline{L}^{A^*}(D)$. The same is true if $A$ does not have a conjugate but $A_d$ is a conjugate
of a Young function equivalent to $A$.

Proof When $A$ has conjugate $A_d$, the statement follows immediately from Lemma 5.2.1 and
Lemma 5.2.3. When $A$ does not have a conjugate, we can replace $A$ with an $A_0$ that does
have a conjugate and for which $|A(\eta) - A_0(\eta)| \le b$ for $\eta \in \mathbb{R}$. (See Remark 3.3 of [16].)
Thus, $|D_n|^{-1} H^o_{D_n}(\phi)$ defined using $A$ differs only by a constant amount from the analogous
expression defined using $A_0$. Since $D$ has finite volume, the $Y_C$ defined using $A$ is a subset
of some $Y_{C'}$, defined using $A_0$ in place of $A$, and the corollary follows. ∎

5.3 Low energy interpolations from $L^1$ and $L^{1,A}$ bounds
5.3.1 $L^A$ bounds from $L^1$ and $L^{1,A}$ bounds

Let $C \in \mathbb{R}$ be an arbitrary constant and let $S_{A,C}(D)$ be the set of weakly differentiable
functions $f$ on $D$ with $\|\nabla f\|_A \le C$. The following lemma states that the imbedding of
$L^1(D) \cap S_{A,C}(D)$ (with the $L^1$ norm) into $L^A \cap S_{A,C}$ (with the $L^A$ norm) is continuous.
Lemma 5.3.1 Let $A$ be a Young function and $D \in G(\frac{d-1}{d})$ and $C > 0$. For every $\epsilon > 0$, there
exists a $\delta = \delta(C, \epsilon, A) > 0$ such that the following two conditions

1. $\|\nabla f\|_A \le C$
2. $\|f\|_1 \le \delta$

together imply $\|f\|_A \le \epsilon$. The lemma remains true if we replace the first statement with the
modified statement: $\int_D A(\nabla f(\eta))\, d\eta \le C$.

Proof First of all, if $\|\nabla f\|_A \le C$, then $g = f/C$ satisfies $\int_D A(\nabla g(\eta))\, d\eta \le 1$; i.e., $g$ satisfies
the modified statement with $C = 1$. It will therefore be enough to show that the modified first
statement and the second statement together imply the conclusion. Statements 2.21 (and
subsequent discussion) and 2.25 of JH| indicate that for some constant $K$ (depending only
on $D$ and $A$), we have $\|\nabla f\|_A \ge \|K \nabla f^*\|_A$ whenever $f^*$ is a positive, radially decreasing,
spherically symmetric "rearrangement" of $f$ defined on a sphere $D'$ with the same volume
as $D$. Though we do not define rearrangements precisely here, it suffices to note that such
rearrangements always exist for measurable $f$, and whenever $f^*$ is a rearrangement of $f$, we
have $\|f\|_A = \|f^*\|_A$ and $\|f\|_1 = \|f^*\|_1$ and $\int_D A(|\nabla f(\eta)|)\, d\eta = \int_{D'} A(|\nabla f^*(\eta)|)\, d\eta$. Thus, if
$f$ violates the theorem conclusion for a particular choice of $C$, $\delta$, and $\epsilon$, then $f^*$ violates the
statement for the same values.

It is therefore sufficient to prove the result for positive, radially decreasing spherically
symmetric functions $f(\eta) = g(|\eta|)$ (defined on a sphere $D'$ of radius $R$) with $g : (0, R] \to [0, \infty)$
decreasing. Thus, the theorem is implied by (the $K = 1$ case of) the following
assertion: For every $\epsilon > 0$ and $K > 0$ and $C > 0$ there exists a $\delta > 0$ such that

1. $\int_0^R A(|g'(r)|)\, d\,\Gamma_d\, r^{d-1}\, dr \le C$

2. $\int_0^R |g(r)|\, d\,\Gamma_d\, r^{d-1}\, dr \le \delta$

together imply $\int_0^R A\big(\frac{g(r)}{\epsilon}\big)\, d\,\Gamma_d\, r^{d-1}\, dr \le K$.

Here $\Gamma_d$ is the volume of the unit sphere in $d$ dimensions (and the product $d\,\Gamma_d$ is the $(d-1)$-dimensional
volume of its boundary). Replacing $C$, $\delta$, and $K$ with their values divided by
$d\,\Gamma_d$, we eliminate that constant: it is now enough for us to prove that for any $\epsilon > 0$, $C > 0$,
and $K > 0$ there exists $\delta > 0$ such that

1. $\int_0^R A(|g'(r)|)\, r^{d-1}\, dr \le C$

2. $\int_0^R |g(r)|\, r^{d-1}\, dr \le \delta$

together imply $\int_0^R A\big(\frac{g(r)}{\epsilon}\big)\, r^{d-1}\, dr \le K$.

We say the theorem "holds for $(R, C, K, \epsilon)$" if, for those fixed values, the above implication
is true for some sufficiently small $\delta$. Let $\alpha > 0$ and $\beta > 0$ be arbitrary constants. We now
show that to prove that the theorem holds for all positive $(R, C, K, \epsilon)$, it is sufficient to
prove that the theorem holds for all positive $(R, C, K, \epsilon)$ for which $R \le \alpha$; furthermore, it is
enough to prove the implication for the positive, decreasing functions $g$ on $(0, R]$ for which
$|g(R)| \le \beta$. To justify these two reductions, first suppose $R > \alpha$ but $0 < R' \le \alpha$. Since $g$
is decreasing, we have $|g(r)| \le |g(R')|$ for $r \ge R'$ and $\delta \ge \int_0^{R'} |g(r)|\, r^{d-1}\, dr \ge \frac{(R')^d}{d}\, g(R')$.
We conclude that for any fixed value of $R'$, we can assume (when $\delta$ is small enough) that
$g(R') \le \beta$. Moreover,

\[ \int_0^R A\Big(\frac{g(r)}{\epsilon}\Big) r^{d-1}\, dr \le \int_0^{R'} A\Big(\frac{g(r)}{\epsilon}\Big) r^{d-1}\, dr + A\Big(\frac{g(R')}{\epsilon}\Big) \frac{R^d - (R')^d}{d}. \]

For fixed $R'$, $R$, and $A$, the latter term clearly tends to zero as $\delta$ tends to zero. Thus, it
is enough for us to bound $\int_0^{R'} A\big(\frac{g(r)}{\epsilon}\big) r^{d-1}\, dr$; and we can do this by solving the modified
problem with $R = R' \le \alpha$.

Thus, we may assume $R \le \frac{1}{2}$ throughout the remainder of the proof. Now, putting
$t = R - r$, and noting $t \le \frac{1}{2}$ (since $R \le \frac{1}{2}$), we can write

Applying Jensen's inequality repeatedly (see explanation below), we conclude that 

A(^+y*A(\g'(u)\)du. 

The second inequality uses the fact that $A$ is convex and $A(0) = 0$ along with the bound
$t \le 1/2$ for the first term and a second application of Jensen's inequality to the probability
measure $d(u/t)$ on $(r, R)$ for the second term. We conclude that




Now we integrate both sides of the inequality, using Fubini's theorem on the second term of 
the right hand side to change order of integration. We then get the result 

£ A ^^T) r d-l dr < j\ d ~ l A dT+ \l* Sr rd " lA dUdT = 
Since $u^d \le R u^{d-1}$ we have the bound (from the modified first assumption in the lemma
statement)



R 







, fg(r)\ d i , „ fg(R)\ R d CR 



For any fixed $A$, $\epsilon$, and $C$, we can assume $\alpha$ and $\beta$ are small enough so that the latter
expression is less than one (since $R \le \alpha$ and $g(R) \le \beta$). ∎

5.3.2 Interpolations from $(d-1)$-dimensional $L^1$ and $L^{1,A}$ bounds

Let $G \subset \mathbb{R}^{d-1}$ be the $(d-1)$-dimensional unit cube. Suppose we are given a linear function
$f_u : \mathbb{R}^d \to \mathbb{R}$ defined by $f_u(\eta) = (u, \eta)$ (where $u \in \mathbb{R}^d$), and a function $f$ defined on $G$.

For any $\epsilon$, we can construct an interpolation between $f_u$ and $f$ on $G \times [0, \epsilon]$ as follows.
Write

\[ g(\eta) = \frac{\eta_d}{\epsilon}\, f_u(\eta_1, \eta_2, \ldots, \eta_{d-1}, \epsilon) + \frac{\epsilon - \eta_d}{\epsilon}\, f(\eta_1, \eta_2, \ldots, \eta_{d-1}). \]

The following lemma implies that, under appropriate conditions, we can make $\epsilon$ small and
still maintain an upper bound on the "average energy" of the interpolation function $g$ over
$G \times [0, \epsilon]$.
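
The interpolation across the slab $G \times [0, \epsilon]$ can be sketched directly. The code below is a minimal illustration, assuming the interpolation is linear in the last coordinate, equal to $f$ on the bottom face and to $f_u$ on the top face; points are represented as tuples, and the name `make_interpolation` is hypothetical.

```python
def make_interpolation(u, f, eps):
    """Return (f_u, g): the linear function and the slab interpolation between f and f_u."""
    def f_u(eta):
        return sum(ui * xi for ui, xi in zip(u, eta))

    def g(eta):
        base, height = tuple(eta[:-1]), eta[-1]
        top = f_u(base + (eps,))                 # value of f_u on the top face
        return (height / eps) * top + ((eps - height) / eps) * f(base)

    return f_u, g

# Example in d = 2: G = [0, 1], u = (1, 2), f(x) = x^2, eps = 0.1.
f = lambda base: base[0] ** 2
f_u, g = make_interpolation((1.0, 2.0), f, 0.1)
bottom = g((0.3, 0.0))   # should agree with f at 0.3
top = g((0.3, 0.1))      # should agree with f_u at (0.3, 0.1)
```

In this example the derivative of $g$ in the last coordinate is the constant $(0.5 - 0.09)/0.1 = 4.1$, which equals $u_d$ plus the mismatch $f_u(\eta_1, 0) - f(\eta_1) = 0.21$ divided by $\epsilon$. This is why a small $L^1$ mismatch between $f$ and $f_u$ is needed to keep the energy of $g$ under control as $\epsilon$ shrinks.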

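Lemma 5.3.2 Fix a constant $C > 0$. For every $\epsilon > 0$, there exists a $\delta > 0$ so that the
following three statements

1. $A(u) \le C$

2. $\int_G A(\nabla f(\eta))\, d\eta \le C$

3. $\|h(\eta)\|_1 \le \delta$, where $h(\eta) = f_u(\eta_1, \ldots, \eta_{d-1}, 0) - f(\eta_1, \ldots, \eta_{d-1})$

together imply that $\frac{1}{\epsilon} \int_{G \times [0, \epsilon]} A(\nabla g(\eta))\, d\eta \le 4C$.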

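Proof If $g_{\eta_1}, \ldots, g_{\eta_d}$ are the partial derivatives of $g$, we can write

\[ \frac{1}{\epsilon} \int_{G \times [0, \epsilon]} A(\nabla g(\eta))\, d\eta = \sum_{i=1}^{d} \frac{1}{\epsilon} \int_{G \times [0, \epsilon]} A(g_{\eta_i}(\eta))\, d\eta. \]

Since the first $d - 1$ derivatives are weighted averages of derivatives of $f$ and $f_u$, the sum of
the first $d - 1$ terms is bounded by $\int_G A(\nabla f(\eta))\, d\eta + \sum_{i=1}^{d-1} A(u_i) \le C + C = 2C$. The $d$th
term is given by

\[ \int_G A\Big(\frac{h(\eta)}{\epsilon} + u_d\Big)\, d\eta. \]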

Write B(rj) = For any S > 0, Lemma 15.3.11 implies that for 5 small enough we 

have $\|h\|_B \le S$. Note that $\|u_n\|_{B,G} \le 1/2$ (where $u = (u_1, \ldots, u_n)$, and we treat $u_n$ as a
constant function). If $\delta$ is small enough so that $\|h/\epsilon\|_{B,G} \le 1/2$, then $\|h/\epsilon + u_n\|_B \le 1$ and hence

$$\int_G A\left(\frac{h}{\epsilon} + u_n\right) d\eta \le 2C.$$ ∎
Now, suppose that $G_1, \ldots, G_{2d}$ are the $2d$ faces of the $d$-dimensional unit cube $D \subset \mathbb{R}^d$
and fix some $\epsilon < 1/2$. Let $D'$ be the cube of side length $1 - 2\epsilon$, positioned in $\mathbb{R}^d$ so as to be
concentric with $D$. Given $f : D \to \mathbb{R}$, we can interpolate between $f$ and $f_u$ as follows.

For $1 \le i \le 2d$, let $G'_i$ be the face obtained by shifting $G_i$ by $\epsilon$ units in a perpendicular
direction, so that it borders $D'$. Let $g_i$ be the linear interpolation between $f$ (on
$G_i$) and $f_u$ (on $G'_i$), similar to the linear interpolation on $G \times [0, \epsilon]$ described above; we can
extend $g_i$ to all of $D$ by writing $g_i(\eta) = f_u(\eta)$ whenever $\eta$ and $G_i$ are on opposite sides of $G'_i$.

Now, we define an inner interpolation function

$$g_I(\eta) = \begin{cases} \inf_{1 \le i \le 2d}\{g_i(\eta)\}, & f(\eta) < \inf_{1 \le i \le 2d} g_i(\eta) \\ \sup_{1 \le i \le 2d}\{g_i(\eta)\}, & f(\eta) > \sup_{1 \le i \le 2d} g_i(\eta) \\ f(\eta) & \text{otherwise.} \end{cases}$$

Since $f(\eta) = g_i(\eta)$, for some $i$, for all $\eta$ on the boundary of $D$, we have $g_I(\eta) = f(\eta)$
on the boundary of $D$. Moreover, since all of the $g_i$'s are equal to $f_u$ inside $D'$, we have
$g_I(\eta) = f_u(\eta)$ for $\eta \in D'$. Similarly, we define an outer interpolation function by letting $g'_i$
be the interpolation between $f_u$ (on $G_i$) and $f$ (on $G'_i$ and the portion of $D$ that lies on the
opposite side of $G'_i$ from $G_i$). Then

$$g_O(\eta) = \begin{cases} \inf_{1 \le i \le 2d}\{g'_i(\eta)\}, & f_u(\eta) < \inf_{1 \le i \le 2d} g'_i(\eta) \\ \sup_{1 \le i \le 2d}\{g'_i(\eta)\}, & f_u(\eta) > \sup_{1 \le i \le 2d} g'_i(\eta) \\ f_u(\eta) & \text{otherwise.} \end{cases}$$

This $g_O(\eta)$ is equal to $f_u(\eta)$ on the boundary of $D$ and $f(\eta)$ inside of $D'$. The following
lemma gives energy bounds on $g_O(\eta)$ and $g_I(\eta)$.

Lemma 5.3.3 Let $G$ be the outer surface of a unit cube $D \subset \mathbb{R}^d$. Fix a constant $C > 0$.
For every $\epsilon > 0$, there exists a $\delta > 0$ for which the following four statements

1.

2.

3.

4. $\|f - f_u\|_{1,\partial D} \le \delta$

together imply

$$|D \setminus D'|^{-1}\int_{D \setminus D'} A(\nabla g_I(\eta))\,d\eta \le (4C)(2d + 1) = 8dC + 4C$$

and the analogous statement for $g_O$.
Proof The main observation is that $\nabla g_I(\eta) \le \max\{\nabla f(\eta), \nabla g_1(\eta), \nabla g_2(\eta), \ldots, \nabla g_{2d}(\eta)\}$ for
almost all $\eta$ (and the same is true for $g_O$). Arguments similar to Lemma produce a bound
of $4C$ on $|D \setminus D'|^{-1}\int_{D \setminus D'} A(\nabla g_i(\eta))\,d\eta$ for each $i$, and the result follows. ∎

We can actually derive a similar version of the above lemma in which the bound on
$\int_{\partial D} A(\nabla f(\eta))\,d\eta$ is omitted. Define $g_I^{\gamma,\epsilon}$ as follows. First, for some $0 < \gamma < 1$, we define
$f_\gamma : D \to \mathbb{R}$ by $f_\gamma(\eta) = \frac{1}{\gamma} f(\gamma\eta)$. (Assume here that $D$ is centered at the origin; so $f_\gamma$ is
essentially "zooming in" on the portion of $f$ defined on the cube $D_\gamma$ of side length $\gamma$ that is
concentric with $D$.) Then construct the interpolations $g_I$ and $g_O$ (as described above) using
$f_\gamma$ instead of $f$. Then we "zoom back out" by writing

[7(77) otherwise. 

Now we have: 

Lemma 5.3.4 Let $D$ be the unit cube. There exists a constant $C_0 > 0$ for which the following
is true. Fix a constant $C > 0$. For all sufficiently small $\epsilon > 0$, there exists a $\delta > 0$ for which
the following three statements

1. $A(u) \le C$

2. $\int_D |D|^{-1} A(\nabla f(\eta))\,d\eta \le C$

3. $\|f - f_u\|_{1,D} \le \delta$

together imply that for some $\gamma$ with $1 - \epsilon \le \gamma \le 1$,

$$\int A(\nabla g_I^{\gamma,\epsilon}(\eta))\,d\eta \le C_0 C\,|D \setminus D_{1-2\epsilon}|.$$

In particular, the average value of $A(\nabla g_I^{\gamma,\epsilon})$, over the region where it fails to be equal to $f_u$, is
bounded above by $C_0 C$. A similar statement is true for $g_O^{\gamma,\epsilon}$, where the roles of $f_u$ and $f$ are
reversed. Also, the conclusion of the lemma remains true if we replace D by \F> , provided 
we 5 with . 

Proof Since $\int_D A(\nabla f(\eta))\,d\eta \le C$, we must have $\int_{\partial D_\gamma} A(\nabla f(\eta))\,d\eta \le 2C/\epsilon$ for a set of
$\gamma$ values in $[1 - \epsilon, 1]$ with measure at least $\epsilon/2$. Since $\|f - f_u\|_1 \le \delta$, we must have

$$\int_{\partial D_\gamma} |f - f_u| \le 2\delta/\epsilon$$

for at least one of the $\gamma$ values for which $\int_{\partial D_\gamma} A(\nabla f(\eta))\,d\eta \le 2C/\epsilon$. For $\epsilon$ sufficiently
small, the volume of $D_\gamma \setminus D_{\gamma(1-\epsilon)}$ is bounded above by $3d\epsilon$. The result then follows by
applying Lemma 5.3.3 to $f_\gamma$ (as defined above). The argument for $g_O$ is similar. The
analogous statement for $\frac{1}{n}D$ follows immediately from the rescaling $f_n : \frac{1}{n}D \to \mathbb{R}$ given by
5.3.3 Discrete interpolation lemma

The proof of the following discrete analog of Lemma 5.3.4 is virtually identical to the proof
of its continuous counterpart. Let $\Lambda_n = [0, n]^d - v$, where $v \in \mathbb{Z}^d$ is the vector with all
components equal to $\lfloor n/2 \rfloor$. If $a \notin \mathbb{Z}$, write $\Lambda_a = \Lambda_{\lfloor a \rfloor}$.

Lemma 5.3.5 Fix a constant $C > 0$. There exists a constant $C_0$ such that the following is
true: For all sufficiently small $\epsilon > 0$, there exists a $\delta > 0$ such that for all sufficiently large
$n$ and any $\phi : \Lambda_n \to \mathbb{R}$, the following three statements

1. $A(u) \le C$

2. $H^{\Phi}_{\Lambda_n}(\phi) \le C|\Lambda_n|$ (where $\Phi$ has nearest neighbor gradient potentials determined by $A$)

3. $\sum_{x \in \Lambda_n} |\phi(x) - (u, x)| \le \delta|\Lambda_n|$

together imply that there exists an inside interpolation function $\psi_I$ for which

1. $\psi_I(x) = \phi(x)$ for all $x \in \partial\Lambda_n$.

2. $\psi_I(x) = (u, x)$ for all $x$ contained in $\Lambda_{(1-\epsilon)n}$.

The same criteria imply the existence of an outside interpolation function $\psi_O : \Lambda_n \to \mathbb{R}$ for
which

1. $\psi_O(x) = (u, x)$ for all $x \in \partial\Lambda_n$.

2. $\psi_O(x) = \phi(x)$ for all $x$ contained in $\Lambda_{(1-\epsilon)n}$.

3. $H^{\Phi}_{\Lambda_n \setminus \Lambda_{(1-\epsilon)n}}(\psi_O) \le C_0 C \epsilon n^d$.

5.4 Approximation by "mostly linear" functions 

In this section, we will see that every $f \in L^{1,A}(D)$ can be approximated by a function which
"mostly agrees" with one of the linear approximators $F_n$ (defined in the following lemma) and
whose total energy outside of the areas on which it agrees with $F_n$ is small. This construction
will be useful to us in Chapter 7.

Lemma 5.4.1 Fix $f \in L^{1,1}(D)$. Let $F_n(\eta)$ be the piecewise linear (linear on simplices)
function (not continuous in general) whose mean and mean gradient are equal to those of $f$
on each simplex of $\frac{1}{n}\mathbb{Z}^d$. Then

$$\lim_{n \to \infty} \|\nabla f - \nabla F_n\|_1 = 0.$$

Proof This merely says $\nabla f$ can be approximated by step functions. ∎






Lemma 5.4.2 For any $f \in L^{1,1}(D)$, $\|f - F_n\|_1 = o(\frac{1}{n})$.

Proof This follows from applying Lemma [5.4. II and Theorem 15. 1 .HI to / — F n . I 

Lemma 5.4.3 Suppose $f \in L^{1,A}(D)$ and fix $\epsilon > 0$. Then for all $n$ sufficiently large, there
exists a function $F_\epsilon \in L^{1,A}(D)$ that is equal to $F_n$ on a closed subset $D'$ of $D$, where the
volume of $D \setminus D'$ is less than $\epsilon$ and $\int_{D \setminus D'} A(\nabla F_\epsilon) \le \epsilon$.

f -AfV f(ri))dn 

Proof First, for any eo > 0, choose C > 2— — eQ — . For any 5 > 0, the fraction of boxes 
b (of side length 1/n) that satisfy 1 1/ — F n \ < ^Jr+r tends to zero in n. It follows that for 
sufficiently large $n$ and any $\delta$, the three conditions of Lemma 5.3.4 will apply to at least a
$1 - \epsilon$ fraction of the boxes of the mesh. We can now define our interpolation function
$F_\epsilon$ to be equal to $f$ on the boxes for which the conditions of Lemma 5.3.4 do not apply and
equal to the interpolation described by Lemma 5.3.4 on the boxes on which they do apply.
It is clear from the lemma that if we take a sufficiently small $\epsilon_0 < \epsilon$, we can also arrange to
have $\int_{D \setminus D'} A(\nabla F_\epsilon) \le \epsilon$. ∎

In Chapter 7, given a function $f$ defined on $D$, and a good approximation $D_n$ of $D$, we
will sometimes use a very naive approach (described in the following lemma) for defining
functions $\phi_n : D_n \to \mathbb{R}$ which (appropriately rescaled) approximate $f$.

Lemma 5.4.4 Fix $f : D \to \mathbb{R}$ and suppose that $\int_D A(\nabla f(\eta))\,d\eta$ is finite. Let the sequence
$D_n$ be a good approximation of $D$. Then the sequence of functions $\phi_n : D_n \to \mathbb{R}$
(approximating $f$) defined by $\phi_n(y) = n^{d+1}\int_{\{\eta : \lfloor n\eta \rfloor = y\}} f(\eta)\,d\eta$ satisfies $|D_n|^{-1} H^{\Phi}(\phi_n) \le
C_0 \int_D A(\nabla f(\eta))\,d\eta$ for some $C_0$ independent of $n$ and $f$.

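Proof Write $s_n(y)$ for the cube $\{\eta : \lfloor n\eta \rfloor = y\}$. Because $D_n$ is a good approximation, all of
the cubes $s_n(y)$, for $y \in D_n$, are subsets of $D$. Note further that

$$\phi_n(y + e_i) - \phi_n(y) = n^{d+1}\int_{s_n(y)}\int_{t=0}^{1/n} \frac{\partial}{\partial\eta_i} f(\eta + t e_i)\,dt\,d\eta.$$

By Jensen's inequality (and the fact that $n^{d+1}\int_{s_n(y)}\int_0^{1/n} dt\,d\eta = 1$), we have

$$A(\phi_n(y + e_i) - \phi_n(y)) \le n^{d+1}\int_{s_n(y)}\int_{t=0}^{1/n} A\left(\frac{\partial}{\partial\eta_i} f(\eta + t e_i)\right) dt\,d\eta.$$

The result follows by summing these integrals over all edges of $D_n$. ∎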
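
A one-dimensional toy case makes the block-averaging bound concrete. The sketch below assumes $d = 1$, $D = [0, 1]$, and that the cell averages are computed from a known antiderivative `F` of $f$; the function names and the choice $A(t) = t^2$ are illustrative assumptions, not part of the lemma.

```python
def block_averages(F, n):
    """phi_n(y) = n^2 * (integral of f over [y/n, (y+1)/n]), using an antiderivative F."""
    return [n * n * (F((y + 1) / n) - F(y / n)) for y in range(n)]


def normalized_energy(phi, A):
    """(1/n) * sum of A over nearest-neighbour differences of phi."""
    n = len(phi)
    return sum(A(b - a) for a, b in zip(phi, phi[1:])) / n


# Usage: f(eta) = eta^2 with antiderivative eta^3 / 3, and A(t) = t^2.
energies = [
    normalized_energy(block_averages(lambda x: x ** 3 / 3, n), lambda t: t * t)
    for n in (4, 16, 64)
]
```

For this choice, $\int_0^1 A(f'(\eta))\,d\eta = 4/3$, and Jensen's inequality as used in the proof keeps every entry of `energies` below that value, so in this toy case the constant $C_0 = 1$ suffices.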
Chapter 6

Limit equalities and the variational principle

The first goal of this chapter is to derive some equivalent definitions of specific free energy
(making use of the notion of "empirical measure" of a configuration $\phi : \Lambda_n \to E$) and an
equivalent definition of the surface tension. In Section 6.1, we prove the equivalence of these
definitions in the simplest setting: when $\Phi$ is an SAP and $(E, \mathcal{E})$ is $\mathbb{R}$, endowed with the
Borel $\sigma$-algebra. In Section 6.2, we describe some (relatively minor) technical modifications
to the arguments of Section 6.1 that enable us to extend these results to all perturbed SAPs,
as well as to higher dimensional real and discrete models ($E = \mathbb{R}^m$ or $E = \mathbb{Z}^m$).

The second goal of the chapter is to prove the second half of the variational principle:
namely, that every $\mathcal{L}$-invariant gradient phase $\mu$ of slope $u \in U_\Phi$ has minimal specific free
energy among $\mathcal{L}$-invariant measures of slope $u$ and hence satisfies $SFE(\mu) = \sigma(u)$. In
Section 6.3, we prove this when $\Phi = \Phi_V$ is an ISAP and $(E, \mathcal{E})$ is $\mathbb{R}$, endowed with the Borel
$\sigma$-algebra. In Section 6.4, we describe the (again, relatively minor) technical modifications
to the arguments of Section 6.3 that enable us to extend them, for $E = \mathbb{R}^m$ or $E = \mathbb{Z}^m$, to
all perturbed ISAPs and perturbed LSAPs. (It is not known whether the second half of
the variational principle, like the first half, holds for all perturbed SAPs.)

6.1 Limit equalities: $PBL(\mu)$, $FBL(\mu)$, and $SFE(\mu)$

Throughout this section, we will assume that $E = \mathbb{R}$ and $\Phi$ is an SAP. Denote by $A$ the
topology of local convergence on $\mathcal{P}(\Omega, \mathcal{F}^\tau)$ and by $\mathfrak{B}$ the basis for that topology consisting
of finite intersections of sets of the form $\{\mu : \mu(F) < \epsilon\}$ where $0 < \epsilon < 1$ and $F : \Omega \to \mathbb{R}$ is
$\mathcal{F}^\tau_\Lambda$-measurable for some $\Lambda \subset\subset \mathbb{Z}^d$, and $F$ is bounded between $0$ and $1$.

For notational convenience, throughout the following two chapters, we will also augment
the space $\Omega$ to be the space of functions from $\mathbb{Z}^d$ to $\mathbb{R} \cup \{\infty\}$ instead of merely from $\mathbb{Z}^d$ to $\mathbb{R}$.
Whenever we deal with a real-valued function $\phi$ that we have defined only on a subset $\Lambda$ of
$\mathbb{Z}^d$, we tacitly assume that it extends to a function in $\Omega$ for which $\phi(x) = \infty$ for $x \notin \Lambda$. (So
in this context, the statement $\phi(x) = \infty$ is a way of saying that $\phi(x)$ has not been defined.)
Given $u \in \mathbb{R}^d$, let $\phi_u : \mathbb{Z}^d \to \mathbb{R}$ be a function with an $\mathcal{L}$-periodic gradient and $\epsilon$ a positive
constant for which

1. $\phi_u$ has asymptotic slope given by $u$.

2. There exists a finite constant $C$ such that for any edge $(x, y)$ in $\mathbb{Z}^d$ and any $\phi$ bounded
between $\phi_u - 2\epsilon$ and $\phi_u + 2\epsilon$, we have: $V_{x,y}(\phi(y) - \phi(x)) \le C$.

Arguments similar to those in Lemma 4.2.2 and Lemma 4.3.2 imply that, whenever $\Phi$
is an SAP and $u \in U_\Phi$, there exists at least one such pair $(\phi_u, \epsilon)$. As in previous chapters,
we take $\Lambda_n = [0, kn - 1]^d$ where $k$ is chosen so that $k\mathbb{Z}^d \subset \mathcal{L}$. Let $C^\epsilon_n$ be the subset of maps
$\phi : \Lambda_n \to \mathbb{R}$ for which $|[\phi(x) - \phi(x_0)] - [\phi_u(x) - \phi_u(x_0)]| \le \epsilon$ for all $x \in \partial\Lambda_n \setminus \{x_0\}$. (Here,
$x_0$ is the origin, so it lies on one corner of the boundary of $\Lambda_n = [0, kn - 1]^d$.) The functions
in $C^\epsilon_n$ are those functions which (up to additive constant) closely approximate the periodic
function $\phi_u$ on the boundary of $\Lambda_n$.

Given a $B \in \mathfrak{B}$, let $B_n$ be the set of functions $\phi : \Lambda_n \to \mathbb{R}$ whose empirical measures lie
in $B$. To say this precisely, let $\theta_x$ denote translation by $x \in \mathcal{L}$. Define $L_n(\phi)$, a measure on
$(\Omega, \mathcal{F})$, by

$$L_n(\phi) = |\Lambda_n \cap \mathcal{L}|^{-1} \sum_{x \in \Lambda_n \cap \mathcal{L}} \delta_{\theta_x \phi}.$$

Then write $B_n = \{\phi \mid L_n(\phi) \in B\}$. In the following definitions, we will assume that $u \in U_\Phi$
and that $\phi_u$ and $\epsilon$ are fixed. After we prove Theorem 6.1.1, it will be clear that the definition
of PBL is independent of the particular choice of $\phi_u$ and $\epsilon$. We use the initials PBL to
mean "pinned boundary limit" and write

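$$PBL^u_B(\mu) = \limsup_{n \to \infty} -|\Lambda_n|^{-1} \log\left(\int 1_{C^\epsilon_n \cap B_n}\, e^{-H^o_{\Lambda_n}(\phi)} \prod_{x \in \Lambda_n \setminus \{x_0\}} d\phi(x)\right).$$

In the integrals in the above limiting sequence, we assume $\phi(x_0)$ is set to $0$ and $\phi(x) = \infty$
for $x \notin \Lambda_n$. We also write

$$PBL^u(\mu) = \sup_{B \ni \mu,\, B \in \mathfrak{B}} PBL^u_B(\mu)$$

$$PBL(\mu) = PBL^{S(\mu)}(\mu).$$

If $u \notin U_\Phi$, then write $PBL^u(\mu) = \infty$; in particular, if $S(\mu) \notin U_\Phi$, then $PBL(\mu) = \infty$. We now
define the "free boundary limit" as follows:

$$FBL_B(\mu) = \liminf_{n \to \infty} -|\Lambda_n|^{-1} \log\left(\int 1_{B_n}\, e^{-H^o_{\Lambda_n}(\phi)} \prod_{x \in \Lambda_n \setminus \{x_0\}} d\phi(x)\right)$$

$$FBL(\mu) = \sup_{B \ni \mu,\, B \in \mathfrak{B}} FBL_B(\mu)$$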
Theorem 6.1.1 If $\Phi$ is an SAP, $E = \mathbb{R}$, and $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$, then

$$SFE(\mu) = FBL(\mu) = PBL(\mu).$$

It is obvious that $FBL(\mu) \le PBL^u(\mu)$ for any $u$; in particular $FBL(\mu) \le PBL(\mu)$. In
the following three subsections, we will prove Theorem 6.1.1 in three steps:

1. $PBL(\mu) \le SFE(\mu)$ when $\mu$ is ergodic

2. $PBL(\mu) \le SFE(\mu)$ for any $\mu$

3. $SFE(\mu) \le FBL(\mu)$

6.1.1 $PBL(\mu) \le SFE(\mu)$ for $\mu$ ergodic

First, as a simple consequence of the ergodic theorem, we show that if the gradient of $\phi$ is
chosen from an $\mathcal{L}$-ergodic gradient measure $\mu$ with finite slope, then $\phi$ closely approximates
a plane, in an $L^1$ sense, with high probability. Precisely, we show the following:

Lemma 6.1.2 Let $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$ be $\mathcal{L}$-ergodic with finite slope $u \in \mathbb{R}^d$. Then for any fixed
$\epsilon$, we have

$$\lim_{n \to \infty} \mu\Big\{|\Lambda_n|^{-1} \sum_{x \in \Lambda_n} |\phi(x) - (x, u) - \phi_{\Lambda_n}| > \epsilon n^{d+1}\Big\} = 0,$$

where $\phi_{\Lambda_n}$ is the mean value of $\phi(x) - (x, u)$ on $\Lambda_n$.

Proof First, we may assume without loss of generality that when $\phi$ is chosen (defined up to
additive constant) from $\mu$, we have $u = 0$ and $\mu(\phi(y) - \phi(x)) = 0$ for each $x, y \in \mathbb{Z}^d$. If this
is not the case, we can let $f$ be a function (determined from $\mu$ up to additive constant) for
which $f(y) - f(x) = \mu(\phi(y) - \phi(x))$ for each $x, y \in \mathbb{Z}^d$ and then replace $\mu$ by $\mu - f$. Thus,
we need only show in this case that for $\epsilon > 0$, we have

$$\lim_{n \to \infty} \mu\Big\{|\Lambda_n|^{-1} \sum_{x \in \Lambda_n} |\phi(x) - \phi_{\Lambda_n}| > \epsilon n^{d+1}\Big\} = 0,$$

where $\phi_{\Lambda_n}$ is the mean value of $\phi$ on $\Lambda_n$.

When $\phi$ is chosen from $\mu$, the classical ergodic theorem (see, e.g., Chapter 14 of [23])
states that for any $\gamma > 0$

hm /x{| '—^ 1 > 7} = 0. 

n^oo |A n | 

A trivial corollary of the ergodic theorem states that if $h : D \to \mathbb{R}^d$ is any continuous,
bounded function on the unit cube $D$, then

hm fi{\ — 1 > 7} = 0. 

n-*oo |A n | 
Now, write $\tilde{\Lambda}_n = (nk)e_1 + \Lambda_n$, where $e_1$ is a basis vector of $\mathbb{Z}^d$. So $\Lambda_n$ and $\tilde{\Lambda}_n$ are adjacent
blocks. A discrete integration by parts gives

E m - X) = w* 1 E [^)-0(x- ei )]/i(x/n), 

s/eA„ xeA„ xeA„uA„ 

where $h(v) = v_1$ for $v = (v_1, \ldots, v_d) \in D$ and $h(v) = 2 - v_1$ for $v \in D + e_1$. It follows that
for any $\gamma$, the probability that the mean value of $\phi$ on $\Lambda_n$ differs from the mean value on the
adjacent block $(nk)e_1 + \Lambda_n$ by more than $\gamma n^{d+1}$ tends to zero in $n$.
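
The summation-by-parts step can be sanity-checked in one dimension. The sketch below assumes $d = 1$ and blocks $\{0, \ldots, N-1\}$ and $\{N, \ldots, 2N-1\}$, and verifies the identity $\sum_{y=N}^{2N-1}\phi(y) - \sum_{x=0}^{N-1}\phi(x) = N\sum_{x=1}^{2N-1}[\phi(x) - \phi(x-1)]\,h(x/N)$ with the tent function $h$. The normalisation is that of this toy case, not a transcription of the display above, and the function names are illustrative.

```python
def tent(v):
    """h(v) = v on [0, 1] and 2 - v on [1, 2]."""
    return v if v <= 1 else 2 - v


def block_sum_difference(phi, N):
    """Sum of phi over {N, ..., 2N-1} minus the sum over {0, ..., N-1}."""
    return sum(phi[N:2 * N]) - sum(phi[:N])


def weighted_gradient_sum(phi, N):
    """N * sum over x = 1..2N-1 of [phi(x) - phi(x-1)] * h(x / N)."""
    return N * sum((phi[x] - phi[x - 1]) * tent(x / N) for x in range(1, 2 * N))


# Usage: an arbitrary height profile on {0, ..., 2N-1}.
N = 5
phi = [(x * x) % 7 for x in range(2 * N)]
lhs, rhs = block_sum_difference(phi, N), weighted_gradient_sum(phi, N)
```

Because the right-hand side is a weighted sum of gradients with bounded, continuous weights, the ergodic theorem applies to it, which is how the comparison of block means is obtained.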

Similarly, if we take a large integer $c$ and form a $(ckn)^d$ cube by joining $c^d$ translated $\Lambda_n$
blocks, then for any fixed $\gamma$, the probability that the mean value on any one of these blocks
differs from the value on any other by $\gamma n^{d+1}$ will tend to zero in $n$.

Let $C$ be the supremum of $\mu(|\nabla\phi(x)|)$ for $x \in \mathbb{Z}^d$. The probability that the mean value of
$|\nabla\phi|$ on one of these blocks is greater than $2C$ also tends to zero in $n$. It follows from the
first part of Theorem 15 . 1 . 31 and Corollary 15.2.21 that for some Co (independent of 8) we have, 
for any 60, 

lim //{lA^r 1 J2 \<P(x) - An | > C n d+1 } = 0. 

xeA n 

Now, taking $N = cn$, the above claims imply that (when $N$ is restricted to multiples of
$c$)

$$\lim_{N \to \infty} \mu\Big\{|\Lambda_N|^{-1} \sum_{x \in \Lambda_N} |\phi(x) - \phi_{\Lambda_N}| > C_0 c^d n^{d+1} = (C_0/c) N^{d+1}\Big\} = 0.$$

It is not hard to see (by applying the same result and restricting to a slightly smaller
box) that the above remains true if $N$ is not restricted to multiples of $c$. Since the above is
true for any $c$, we have

$$\lim_{N \to \infty} \mu\Big\{|\Lambda_N|^{-1} \sum_{x \in \Lambda_N} |\phi(x) - \phi_{\Lambda_N}| > \epsilon N^{d+1}\Big\} = 0,$$

for all $\epsilon > 0$. ∎

A simple corollary is the following:

Lemma 6.1.3 Let $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$ be $\mathcal{L}$-ergodic with finite slope $u \in \mathbb{R}^d$ and finite specific
free energy. Then for any fixed $\epsilon$,

$$\lim_{n \to \infty} \mu\Big\{|\Lambda_n|^{-1} \sum_{x \in \Lambda_n} |\phi(x) - (x, u) - \phi(0)| > \epsilon n^{d+1}\Big\} = 0.$$

Proof To deduce this from Lemma 6.1.2, we need only show that the probability that
$\phi(x_0) - (x_0, u)$ differs from $|\Lambda_n|^{-1} \sum_{x \in \Lambda_n} [\phi(x) - (x, u)]$ on $\Lambda_n$ by more than $\epsilon n$ tends to zero
in $n$. By shift-invariance, we can show equivalently that if we choose uniformly an $x$ in $\Lambda_{2n}$,
then the probability that $f(x) = \phi(x) - (x, u)$ differs from $g(x) = |\Lambda_n|^{-1} \sum_{y \in \theta_x \Lambda_n} [\phi(y) - (y, u)]$
by $\epsilon n$ tends to zero in $n$. However, an immediate consequence of Lemma 6.1.2 is that the
probability that either $f(x)$ or $g(x)$ differs from $|\Lambda_{2n}|^{-1} \sum_{y \in \Lambda_{2n}} [\phi(y) - (y, u)]$ by $\epsilon n$ tends
to zero in $n$. ∎
Now, we continue with the proof that $PBL(\mu) \le SFE(\mu)$ when $\mu$ is $\mathcal{L}$-ergodic and
$S(\mu) = u \in U_\Phi$. Recall that $SFE(\mu) = \lim_{n \to \infty} |\Lambda_n|^{-1} FE_{\Lambda_n}(\mu)$ and

$$PBL(\mu) = \sup_{B \ni \mu,\, B \in \mathfrak{B}} \limsup_{n \to \infty} |\Lambda_n|^{-1} FE_{\Lambda_n}(\nu^B_n),$$

where $\nu^B_n$ is the Gibbs measure

$$Z^{-1}\, 1_{A_n \cap B_n}\, e^{-H^o_{\Lambda_n}(\phi)} \prod_{x \in \Lambda_n \setminus \{x_0\}} d\phi(x)$$

and $Z$ is chosen to make the above a probability measure. Let $\mu_n$ be the restriction of $\mu$ to
$\mathcal{F}^\tau_{\Lambda_n}$. It is now enough for us to show that for each $B \ni \mu$, $B \in \mathfrak{B}$, we have $FE_{\Lambda_n}(\nu^B_n) \le
FE_{\Lambda_n}(\mu_n) + o(|\Lambda_n|)$. Since $\nu^B_n$ has minimal free energy among measures on $(\Omega, \mathcal{F}^\tau_{\Lambda_n})$ that
are supported on $A_n \cap B_n$, it will be sufficient to generate a measure $\mu'_n$ on $(\Omega, \mathcal{F}^\tau_{\Lambda_n})$, also
supported on $A_n \cap B_n$ for sufficiently large $n$, which satisfies $FE_{\Lambda_n}(\mu'_n) \le FE_{\Lambda_n}(\mu_n) + o(|\Lambda_n|)$.

We take $B$ to be the set of measures $\pi$ on $(\Omega, \mathcal{F}^\tau)$ for which $|\pi(F_i) - \mu(F_i)| < \epsilon$, for some
finite sequence of cylinder functions $F_i : \Omega \to [0, 1]$ and some constant $\epsilon$. Write $B'$ for
the set of measures $\pi$ for which $|\pi(F_i) - \mu(F_i)| < \epsilon/2$ for each $i$. Now, $\mu_n$ is not necessarily
supported on $A_n \cap B_n$. However, the ergodic theorem implies that the probability that a
sample from $\mu_n$ lies in $B'_n \subset B_n$ tends to one as $n$ tends to $\infty$.

To sample from $\mu'_n$, we will first sample $\phi$ from $\mu_n$ (conditioned on lying in $B'_n$), and
then use a "random truncation" to alter $\phi$ in a way that forces it to lie in $A_n \cap B_n$. In a
separate step, we will check that $FE_{\Lambda_n}(\mu'_n) \le FE_{\Lambda_n}(\mu_n) + o(|\Lambda_n|)$ by showing that these
random truncations change the free energy by at most $o(|\Lambda_n|)$.

Let $D$ be the unit cube $[0, 1]^d$ and let $f : D \to \mathbb{R}$ be the piecewise-linear "pyramid-shaped"
function for which $f(z) = 0$ for $z \in \partial D$, $f(z_0) = 1$ where $z_0$ is the center point of
$D$, and $f$ is linear on each line segment connecting $z_0$ to a point in $\partial D$. Now, given $\epsilon > 0$,
we define a pyramid-shaped, $\epsilon$-sloped function on $\Lambda_n$ by $p_n(x) = \phi_u(x) + n\epsilon f(x/n)$. If $\epsilon$ is as
described in the definition of $\phi_u$, then $p_n$ will have the property that $V_{x,y}(\phi(y) - \phi(x)) \le C$
whenever $x, y \in \Lambda_n$, $|x - y| = 1$ and $|(\phi(y) - \phi(x)) - (p_n(y) - p_n(x))| \le \epsilon$.
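
The pyramid-shaped function has a closed form, $f(z) = 1 - 2\max_i |z_i - 1/2|$, which vanishes on $\partial D$, equals $1$ at the center, and is linear along every segment from the center to the boundary. The Python sketch below uses that formula; it assumes the rescaling $x \mapsto x/n$ lands in the unit cube (that is, $k = 1$), and the names `pyramid` and `pyramid_profile` are illustrative.

```python
def pyramid(z):
    """f(z) = 1 - 2 * max_i |z_i - 1/2| on the unit cube [0, 1]^d."""
    return 1.0 - 2.0 * max(abs(zi - 0.5) for zi in z)


def pyramid_profile(phi_u, x, n, eps):
    """p_n(x) = phi_u(x) + eps * n * f(x / n) for a lattice point x."""
    return phi_u(x) + eps * n * pyramid([xi / n for xi in x])


# Usage: a flat reference surface phi_u = 0 on a box of side n = 4.
peak = pyramid_profile(lambda x: 0.0, (2, 2), 4, 0.5)
edge = pyramid_profile(lambda x: 0.0, (0, 3), 4, 0.5)
```

Since $f$ is Lipschitz with constant $2$ in the sup norm, neighboring lattice values of $n\epsilon f(x/n)$ differ by at most $2\epsilon$, which matches the $\pm 2\epsilon$ window used in the definition of $\phi_u$.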

Now, we first define an "upper truncated" measure $\mu''_n$ as follows: to sample from $\mu''_n$, first
choose $\phi$ from $\mu$, taking the additive constant so that $\phi(x_0) = 0$. Let $A^+$ be the set of vertices
x G A n for which 0(x) > p n (x) and let A~ the set of x G A n for which 0(x) < — Then 
for each $x \in A^+$, we re-sample $\phi(x)$ from

$$Z^{-1}\, 1_{\{p_n(x) \le \phi(x) \le p_n(x) + \epsilon\}} \exp\Big(-\sum_{y \in \Lambda_n \setminus A^+,\, |x - y| = 1} V_{x,y}(\phi(y) - \phi(x))\Big)\, d\phi(x),$$

where $Z$ is the appropriate normalizing constant.

Figure 6.1: Shown here are $\phi_u$ (dark lines); $\phi_u \pm \epsilon n f(nx)$ and $\phi_u(x) \pm (\epsilon n f(nx) + \epsilon)$ (dotted
lines); and $\phi$ before and after a random truncation (where $d = 1$).

Now, we claim that $FE_{\Lambda_n}(\mu''_n) \le FE_{\Lambda_n}(\mu_n) + o(|\Lambda_n|)$. To see this, note that by Lemma
2.1.3 we can compute each relative entropy in stages: to sample from $\mu_n$ or $\mu''_n$, we can
first choose $A^+$, then choose the values of $\phi$ conditioned on $A^+$. The relative entropy is the
relative entropy of the first random process (the choice of $A^+$) plus the expected relative
entropy of the second; similarly, the relative entropy of the choice of $\phi$ (conditioned on $A^+$)
can be written as the relative entropy of the choice of $\phi(x)$ for $x \notin A^+$ plus the expected
relative entropy of the choice of $\phi(x)$ for $x \in A^+$. Since the latter step is the only one which
is different in the two processes, the difference in the two relative entropies depends only on
the expected difference in the relative entropy of the last two choices. For a given choice of
$A^+$ and $\phi$ (defined for $x \notin A^+$), write

xeA+ 

where $Z$ makes the above a probability measure; let $\pi$ (resp., $\pi''$) be the conditional
distribution of $\mu_n$ (resp. $\mu''_n$) conditioned on $A^+$ and $\phi(x)$ (for $x \notin A^+$). Since, by Lemma
6.1.3, the expected size of $|A^+|$ is $o(|\Lambda_n|)$, it is enough for us to show that $\mathcal{H}(\pi'', \pi_0) \le
\mathcal{H}(\pi, \pi_0) + c|A^+|$ for some constant $c$. Since $\mathcal{H}(\pi, \pi_0)$ is positive, it is enough to show
$\mathcal{H}(\pi'', \pi_0) \le c|A^+|$. The reader may check this last fact, again, by writing the relative entropy
as a sum of a sequence of expected conditional relative entropies, one for each $x \in A^+$. The
key observation is that when one chooses $\phi(x)$, and $\phi(y)$ has already been chosen for
some nonempty subset $S_x$ of the neighbors $y$ of $x$, and each $\phi(y) \le \phi_u(y) + \epsilon n f(y/n)$, then
the measure e~^»£S« vmv) ~^ x))s{y)) lct>{x)<<t> u {x)+en}{y/n)d(t){x) (where s(y) is 1 if x precedes y 
lexicographically, and —1 otherwise) obtains a bounded fraction of its mass in the interval 
[4> u ( x ) + en f{y/ n ), 4>u(%) + €nf(y/n) + 1], independently of the precise values of 4>(y). 

To sample from $\mu'_n$, first sample from $\mu''_n$, then apply a "lower truncation" (analogous
to the upper truncation described above, using $A^-$ instead of $A^+$) and condition on the
output lying in $B_n$. Since, by the ergodic theorem, the probability of the latter event tends
to one, we have $FE_{\Lambda_n}(\mu'_n) \le FE_{\Lambda_n}(\mu_n) + o(|\Lambda_n|)$, as desired.



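6.1.2 $PBL(\mu) \le SFE(\mu)$ for general $\mu$

Since SFE is affine, it is enough to show that $PBL(\mu)$ is "strongly convex", i.e., that when
$\mu = \int \nu\, w_\mu(\nu)\,d\nu$, we have $PBL(\mu) \le \int PBL(\nu)\, w_\mu(\nu)\,d\nu$.

First we will show that PBL is convex. Suppose that $\mu_1$ and $\mu_2$ have slopes $u_1$ and $u_2$
respectively, $0 < a < 1$, $\mu = a\mu_1 + (1 - a)\mu_2$, and $u = au_1 + (1 - a)u_2$. It is enough to show
that $PBL^u_B(\mu)$ is less than or equal to $a\,PBL^{u_1}(\mu_1) + (1 - a)\,PBL^{u_2}(\mu_2)$ for each $B \ni \mu$, $B \in \mathfrak{B}$. We
may assume $B$ is a finite intersection of sets of the form $\{\pi : |\pi(F_j) - \mu(F_j)| < \epsilon\}$, where
each $F_j : \Omega \to [0, 1]$ is a cylinder function. For $i \in \{1, 2\}$, take $B^i$ to be the intersection of
the sets $\{\pi : |\pi(F_j) - \mu_i(F_j)| < \epsilon/2\}$. By the definition of PBL, it is now enough to show
that $PBL^u_B(\mu)$ is less than or equal to $a\,PBL^{u_1}_{B^1}(\mu_1) + (1 - a)\,PBL^{u_2}_{B^2}(\mu_2)$. And to prove this,
it is enough for us to show that for any fixed $n$, letting $M$ get large, we have

$$-|\Lambda_M|^{-1}\log\Big[\int 1_{A_M \cap B_M}\, e^{-H^o_{\Lambda_M}(\phi)} \prod_{x \in \Lambda_M \setminus \{x_0\}} d\phi(x)\Big]$$
$$\le -a|\Lambda_n|^{-1}\log\Big[\int 1_{A_n \cap B^1_n}\, e^{-H^o_{\Lambda_n}(\phi)} \prod_{x \in \Lambda_n \setminus \{x_0\}} d\phi(x)\Big]$$
$$- (1 - a)|\Lambda_n|^{-1}\log\Big[\int 1_{A_n \cap B^2_n}\, e^{-H^o_{\Lambda_n}(\phi)} \prod_{x \in \Lambda_n \setminus \{x_0\}} d\phi(x)\Big] + o(1).$$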

Next, roughly speaking, we would like to combine $\phi_{u_1}$ and $\phi_{u_2}$ to form a "washboard"
function of slope $u$ whose gradient agrees with that of $\phi_{u_1}$ on an $a$ fraction of the points and
that of $\phi_{u_2}$ on a $(1 - a)$ fraction of points. (See Figure 6.2.)

Write $p_1 = (1 - a)$ and $p_2 = a$. Fix some large integer $N$ and consider the layered
sequence of surfaces $\phi^j_{u_i}(x) = \phi_{u_i}(x) + p_i N j$ (for $j \in \mathbb{Z}$); let $\phi_{u_i}(x, \eta)$ give the index $j$ of the
lowest layer which lies beneath the point $(x, \eta)$, i.e., the smallest $j$ for which $\phi^j_{u_i}(x) < \eta$.

Now, write ip(x) = mi{r]\(p Ul (x, rf) + (p U2 (x, rf) < 0. Now, fix n and tile Z d with a sequence 
Aj of cubes of side-length n. It is not hard to see, for fixed n, that if) has asymptotic slope 
given by u and that the fraction of cubes on which the gradient of if) fails to agree with that 
of either <pi or 2 throughout the cube is o(-^). Moreover, if e is the minimum of the e values 
in the definitions of <fr Ul and (f> U2 , then the energy at any edge of <f> will be less than or equal 
to C whenever (f> is sampled from the box measure of radius e centered at if). 

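Define $\nu_n$ to be the measure $1_{A_n \cap B_n}\, e^{-H^o_{\Lambda_n}(\phi)} \prod_{x \in \Lambda_n \setminus \{x_0\}} d\phi(x)$; define $\nu^i_n$ analogously using
$B^i_n$ instead of $B_n$. We can restate our goal as follows: we need to prove that as $N$ and $M$ get large,

$$|\Lambda_M|^{-1} FE(\nu_M) \le a|\Lambda_n|^{-1} FE(\nu^1_n) + (1 - a)|\Lambda_n|^{-1} FE(\nu^2_n) + o(1).$$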
Figure 6.2: A "washboard" shaped surface $\psi$ when $d = 2$. The shaded regions have slope $u_1$
and the unshaded regions have slope $u_2$.

Since $\nu_M$ has minimal free energy among measures supported on $A_M \cap B_M$, it is enough to
generate some measure $\nu'_M$ supported on $A_M \cap B_M$ for which the analogous expression holds.

Now, we define a measure $\nu'_M$ on $(\Omega, \mathcal{F}^\tau_{\Lambda_M})$ as follows. To sample from $\nu'_M$, first choose $\phi$
from the radius-$\epsilon$ box measure centered at $\psi$ on $\Lambda_M$. Then, for all cubes $\Lambda_j$ on which the
gradient of $\psi$ is identically equal to that of either $\phi_{u_1}$ or $\phi_{u_2}$ (say $\phi_{u_i}$), we re-sample from
$\nu^i_n$ (when $\Lambda_j$ is translated to coincide with $\Lambda_n$ for the purposes of the definition). Letting $N$
and $M$ get large in such a way that $M/N \to \infty$, we see that

$$|\Lambda_M|^{-1} FE(\nu'_M) \le a|\Lambda_n|^{-1} FE(\nu^1_n) + (1 - a)|\Lambda_n|^{-1} FE(\nu^2_n) + o(1).$$

Moreover, if we modify $\nu'_M$ by adding a truncation (as in the previous section), this modification
will also change the normalized free energy by at most $o(1)$, and the result follows.

It remains to check that $PBL(\mu)$ is "strongly convex", i.e., that when $\mu = \int \nu\, w_\mu(\nu)\,d\nu$,
we have

$$PBL(\mu) \le \int PBL(\nu)\, w_\mu(\nu)\,d\nu.$$

However, it is obvious from the definition of PBL that it is lower semi-continuous in the
topology of local convergence. So it is enough to observe that we can approximate $\mu$ in this
topology by a sequence of slope-$u$ weighted averages of the form $\mu^k = \sum_i a_{i,k}\, \mu_{i,k}$ where
$0 \le a_{i,k} \le 1$, the $\mu_{i,k}$ are ergodic, the $\mu^k$ converge to $\mu$ in the topology of local convergence,
and $\limsup SFE(\mu^k) \le SFE(\mu)$.

If one samples a sequence $\mu_i$ of ergodic components independently from $w_\mu$, then the law
of large numbers implies that $\nu_n = \frac{1}{n}\sum_{i=1}^n \mu_i$ converges to $\mu$ in the topology of weak local
convergence and that $SFE(\nu_n) \to SFE(\mu)$ almost surely. The desired approximation is now
easily obtained by altering the coefficients of the $\nu_n$ slightly (so that the slope is exactly $u$
instead of approximately $u$).
6.1.3 Empirical measure argument: $SFE(\mu) \le FBL(\mu)$

Recall the definition

$$FBL_B(\mu) = \liminf_{n \to \infty} -|\Lambda_n|^{-1} \log \int 1_{B_n}\, e^{-H^o_{\Lambda_n}(\phi)} \prod_{x \in \Lambda_n \setminus \{x_0\}} d\phi(x).$$

We can normalize the measure $1_{B_n}\, e^{-H^o_{\Lambda_n}(\phi)} \prod_{x \in \Lambda_n \setminus \{x_0\}} d\phi(x)$ to produce a probability measure
on the set of functions from $\Lambda_n \setminus \{x_0\}$ to $\mathbb{R}$. We can extend this to a probability measure
$\nu^B_n$ on $\Omega$ in which the events $\phi(x_0) = 0$ and $\phi(x) = \infty$, $x \notin \Lambda_n$, have probability one. Let
$\Lambda^m_n$ be the subset of $\Lambda_n$ containing vertices which are at least $m$ units in distance from the
boundary of $\Lambda_n$. Now, let $m(n)$ be some function of $n$ for which $m(n)$ tends monotonically
to $\infty$ in $n$ but $m(n) = o(n)$. Let $\mu^B_n = |\Lambda^{m(n)}_n \cap k\mathbb{Z}^d|^{-1} \sum_{y \in \Lambda^{m(n)}_n \cap k\mathbb{Z}^d} \theta_y \nu^B_n$.

Lemma 6.1.4 As $n$ tends to infinity along an increasing sequence for which

$$-|\Lambda_n|^{-1} \log \int 1_{B_n}\, e^{-H^o_{\Lambda_n}(\phi)} \prod_{x \in \Lambda_n \setminus \{x_0\}} d\phi(x) \to FBL_B(\mu),$$

at least one subsequential limit $\mu_B$ of the measures $\mu^B_n$ exists; any such limit satisfies

$$SFE(\mu_B) \le FBL_B(\mu).$$

Proof Fix any integer r > 1 and suppose that Ai,...Aj are disjoint translations of A r 
contained in A n . Then Lemma (2.3. II implies that 

j 

FEM) > Y, FE ^ + c A «\ u '=i A *l +j) . 



i=l 

where $c$ is a fixed constant, namely, the minimum possible value of $FE_e(\mu)$ where $e$ is a
single edge. Now, if $n$ is large enough so that $m(n) > r$, then $\mu_n$, restricted to $\Lambda_r$, is a
weighted average of the restrictions of $\nu_n$ to translations $\Lambda^1, \ldots, \Lambda^{(n - m(n))^d}$ of $\Lambda_r$. We can
divide $\{\Lambda^i\}$ into $r^d = |\Lambda_r \cap k\mathbb{Z}^d|$ sub-collections according to the values of the components
of their lexicographically minimal corners modulo $rk$; each such sub-collection consists of
disjoint copies of $\Lambda_r$. Using convexity of free energy for the first step, we may conclude:

\KnkZ d \- l FE K (^B) <r~ d 

(n—m(n) d ) 

£ (n-m(n))- d FE A M% < 
i=i 

(1 - o(n))\A n n kZX'FEM) ~ o{n) + c(r" d ) 

This implies that 

limsup \Ar\~ 1 FE Ar (fi n ) < limsup \A n \- l FE An (v%) = FBL B {y). 
By Lemma 2.1.5, this also implies that $\mu_n$ restricted to $\Lambda_r$ has a subsequential limit. By a
diagonalization argument, we can choose a subsequence for which such a limit exists for
every $r$; in this case, this subsequence converges to a measure $\mu_B$ on $(\Omega, \mathcal{F}^\tau)$, which
is easily seen to be $\mathcal{L}$-invariant and satisfy $|\Lambda_r|^{-1} FE_{\Lambda_r}(\mu_B) \le FBL_B(\mu) - cr^{-d}$. Since this is true
for any $r$, we have $SFE(\mu_B) \le FBL_B(\mu)$. ∎

Now, by Lemma 2.1.7, the set $M_{FBL(\mu)} = \{\nu : SFE(\nu) \le FBL(\mu)\}$ is metrizable in the
topology of weak local convergence. Hence, we can choose a sequence of sets $B^m \in \mathfrak{B}$ in such
a way that each is contained in the ball of radius $1/m$ about $\mu$ with respect to this metric.
It is then clear that $\mu_{B^m} \to \mu$; hence, by Lemma 6.1.4 and Theorem 2.4.2,

$$SFE(\mu) \le \liminf SFE(\mu_{B^m}) \le \limsup FBL_{B^m}(\mu) = FBL(\mu).$$



6.1.4 Alternate definition of $\sigma$

Theorem 6.1.1 also implies the following alternative definition of $\sigma$:

Corollary 6.1.5 If $\Phi$ is an SAP and $E = \mathbb{R}$ or $E = \mathbb{Z}$, then

$$\sigma(u) = \liminf_{n \to \infty} -|T_n|^{-1} \log Z_{T_n} \ge SFE(\mu),$$



where Zt„ are as defined in Section Iflg and \i is the subsequential limit of the measures 
described in Lemma \^.2.b\ 

Proof Suppose $E = \mathbb{R}$. By tiling $T_n$ with large cubes and considering measures obtained
by taking a box measure centered at $\phi_u$ outside of the cubes and then sampling the interiors
according to a Gibbs measure, it is easy to see that for any $\mu$ with slope $u$,

$$\liminf_{n \to \infty} -|T_n|^{-1} \log Z_{T_n} \le PBL(\mu) = SFE(\mu).$$

Hence,

$$\sigma(u) \ge \liminf_{n \to \infty} -|T_n|^{-1} \log Z_{T_n}.$$

For the other direction, it is enough to construct a measure $\mu$ with

$$SFE(\mu) \le \liminf_{n \to \infty} -|T_n|^{-1} \log Z_{T_n},$$

and we can choose a subsequential limit $\mu$ of the torus measures which has this property.
We will discuss the case $E = \mathbb{Z}$ in the next section. ∎
6.2 Limit equalities in other settings



6.2.1 Discrete systems 

In this section, we describe the modifications to the proof of Theorem 6.1.1 necessary for the
following discrete analog of the theorem.

Theorem 6.2.1 If $\Phi$ is an SAP, $E = \mathbb{Z}$, and $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$, then

$$SFE(\mu) = FBL(\mu) = PBL(\mu).$$

If $E = \mathbb{Z}$, then the only relevant values of $V_{x,y} : \mathbb{R} \to \mathbb{R}$ are the values assumed at integers.
Thus, we lose no generality in making the convenient assumption that each $V_{x,y}$ is linearly
interpolated between integer values; that is, $V_{x,y}$ is linear on $(j, j + 1)$ for each $j \in \mathbb{Z}$, lower
semi-continuous, and convex. (Thus, $V_{x,y}$ is continuous and finite on some closed interval
$[i, j]$ with $i, j \in \mathbb{Z} \cup \{-\infty, \infty\}$, and infinite elsewhere.)

If $\phi : \mathbb{Z}^d \to \mathbb{R}$ is any continuous configuration, then we can define a "randomly rounded"
discrete configuration by $\phi_e = \lfloor \phi + e \rfloor$ where $e$ is chosen uniformly from $[0, 1)$. A key
observation is that the expected value of $V_{x,y}(\phi_e(y) - \phi_e(x))$ is equal to $V_{x,y}(\phi(y) - \phi(x))$.
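
The key observation holds because, writing $\delta = \phi(y) - \phi(x)$, the rounded difference equals $\lfloor \delta \rfloor + 1$ with probability $\delta - \lfloor \delta \rfloor$ and $\lfloor \delta \rfloor$ otherwise, and $V_{x,y}$ is linear between consecutive integers. The Python sketch below checks this exactly for a single edge; the helper names and the test potential $j \mapsto j^2$ are illustrative assumptions.

```python
import math


def interpolated(values_at_integers):
    """Piecewise-linear extension of a function given on the integers."""
    def V(t):
        j = math.floor(t)
        frac = t - j
        return (1 - frac) * values_at_integers(j) + frac * values_at_integers(j + 1)
    return V


def expected_rounded_energy(V, a, b):
    """E[V(floor(b + e) - floor(a + e))] for e uniform on [0, 1), computed exactly."""
    # The rounded difference only changes when a + e or b + e crosses an integer.
    cuts = sorted({0.0, 1.0, (1 - a % 1) % 1, (1 - b % 1) % 1})
    total = 0.0
    for lo, hi in zip(cuts, cuts[1:]):
        mid = (lo + hi) / 2  # the rounded difference is constant on (lo, hi)
        total += (hi - lo) * V(math.floor(b + mid) - math.floor(a + mid))
    return total


# Usage: heights a = phi(x), b = phi(y) at the two ends of one edge.
V = interpolated(lambda j: j * j)
gap = expected_rounded_energy(V, 0.3, 2.05) - V(2.05 - 0.3)
```

Here `gap` vanishes up to floating-point error. The same rounding offset $e$ is used at both endpoints, which is what makes the rounded difference take only two values.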

Now, we take $\phi_u$ to be any real-valued function of slope $u$, with $\mathcal{L}$-periodic gradient,
which has finite $\Phi$ energy. Now, we can define $FBL(\mu)$ and $PBL(\mu)$ precisely as in the
continuous case except that when defining $PBL(\mu)$, we fix the boundary conditions by
randomly rounding $\phi_u$. So $PBL^u_B(\mu) = FE_{\Lambda_n}(\mu_n)$ where $\mu_n$ is defined as follows: to sample $\phi$
from $\mu_n$, first choose $\phi$ on the boundary via a random rounding $\phi^e_u$ of $\phi_u$, where $e$ is chosen
uniformly in $[0, 1)$; then choose $\phi$ on the inside according to the Gibbs measure conditioned
on the empirical measure lying in $B$ (i.e., on $\phi$ lying in $B_n$, as defined above). We define
FBL exactly as before.

Except for this change in setup, all of the arguments and definitions are essentially
identical to the continuous cases. As before, it is obvious that $FBL(\mu) \le PBL(\mu)$, and
the argument that $SFE(\mu) \le FBL(\mu)$ is the same as before. In the proof that $PBL(\mu) \le
SFE(\mu)$, however, there is a slight difference in that we no longer need to define box
measures, since singleton measures themselves have finite free energy, and the "random
truncation" can be replaced by a non-random one. For the alternate definition of $\sigma$, since we
cannot necessarily cause the slope of a configuration on the torus $T_n$ to be exactly $u$, we have
to round $u$ to an integer vector multiple of $1/n$ (as described in Lemma 4.2.6). However, the
remainder of the proof is the same as the continuous case.

6.2.2 Higher dimensions and PSAPs 

Finally, in this section, we give the most general form of Theorem 6.1.1.

Theorem 6.2.2 If $\Phi$ is a perturbed SAP, $E = \mathbb{Z}^m$ or $E = \mathbb{R}^m$, and $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$, then

$$SFE(\mu) = FBL(\mu) = PBL(\mu).$$
First, note that if $\Phi$ is an SAP (which we can write $\Phi = \sum_{i=1}^m \Phi_i$, where each $\Phi_i$ is a
one-dimensional SAP), then the definitions of PBL and FBL are the same as those given
at the beginning of Section 6.1, except that $\phi_u = \prod_{i=1}^m \phi_u^i$, $A_n = \prod_{i=1}^m A_n^i$, and $B_n = \prod_{i=1}^m B_n^i$,
where the $A_n^i$ and $B_n^i$ are subsets of functions from $\Lambda_n$ to $\mathbb{R}$, defined as they would be if we
were in the one-dimensional setting using $\Phi_i$ and $u_i$ (the $d$-dimensional slope determined by
the $i$th row of $u$) instead of $\Phi$. If $\Phi + \Psi$ is a perturbed SAP, where $\Phi$ strictly dominates $\Psi$,
then we will also define $A_n$ and $B_n$ using $\Phi$.

Proof In both the $m > 1$ and perturbed settings, the argument for $SFE(\mu) \le FBL(\mu)$
is exactly the same as before; in each setting, as in the simplest one-dimensional case, it is
obvious from the definitions that $FBL(\mu) \le PBL(\mu)$.

In the unperturbed case when $m > 1$, the proof that $PBL(\mu) \le SFE(\mu)$ for $\mu$ ergodic
is the same as in the one-dimensional case except that we apply a separate randomized
truncation in each of the $m$ coordinate directions. The generalization of this result to
non-ergodic $\mu$ is exactly the same as the proof given for $E = \mathbb{R}$.

In the perturbed case where the potential is defined to be $\Phi + \Psi$ (where $\Phi$ is simply
attractive and strictly dominates $\Psi$, and $\Psi$ has range $k < \infty$), instead of defining $A_n$ to be
the set for which the boundary values $\phi(x) - \phi(x_0)$ (for $x \in \partial\Lambda_n$) are within $\epsilon$ of $\phi_u(x) - \phi_u(x_0)$,
we take $A_n$ to be the set in which $|(\phi(x) - \phi(x_0)) - (\phi_u(x) - \phi_u(x_0))| < \epsilon$ for all $x \in \Lambda_n$ which
are within $k$ units of distance from $\partial\Lambda_n$. Then we use the one-dimensional argument (just as
before) to produce a random truncation in which the expected energy added each time we
decide $\phi(x)$ for another $x \in \Lambda_+$ is finite; then we observe that since the expected combined $\Phi$
energy that occurs in edges within $k$ units of points in $\Lambda_+$ is $o(|\Lambda_n|)$, the expected combined
$\Psi$ energy is $o(|\Lambda_n|)$ as well. □

6.3 Variational principle 

We have already proved in Lemma 2.5.2 that if $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$ has finite slope $u$ and
$SFE(\mu) = \sigma(u)$, then $\mu$ must be a Gibbs measure. This, together with the converse, is
called the variational principle.

Theorem 6.3.1 Let $\Phi$ be a perturbed ISAP (when $E = \mathbb{R}^m$ or $\mathbb{Z}^m$) or a perturbed LSAP
(when $E = \mathbb{Z}^m$). If $\mu$ is an ergodic gradient measure of finite slope $u$, then $\mu$ is a Gibbs
measure if and only if $SFE(\mu) = \sigma(u)$. If $\mu$ is a not-necessarily-ergodic gradient Gibbs
measure with ergodic decomposition given by

$$\mu = \int_{\mathrm{ex}\,\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)} w_\mu(\nu)\, d\nu,$$

then (letting $S(\nu)$ denote the slope of $\nu$) $\mu$ is a Gibbs measure if and only if

$$SFE(\mu) = w_\mu(\sigma(S(\cdot)))$$

(using the abbreviation $w_\mu(\sigma(S(\cdot))) := \int_{\mathrm{ex}\,\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)} w_\mu\, \sigma(S(\nu))\, d\nu$).
We will first observe here that the second statement follows from the first; we will then prove
the first statement in the case that $E = \mathbb{R}$ and $\Phi = \Phi_V$ is an ISAP, delaying the more general
discussion until Section 6.4.

Proof The second statement in the lemma follows from the first, using Lemma EIS31 and the 
fact that slope is $e(\mathrm{ex}\,\mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau))$ measurable. For the first statement, by Lemma 2.5.2 we
need only show that if $\mu$ is an ergodic Gibbs measure of slope $u$, then $SFE(\mu) \le \sigma(u)$. By
Lemma 6.1.1 (noting that in the definition of PBL and any $\mu$ of slope $u$, the limit defining
$PBL_B(\mu)$ only gets smaller if we replace $A_n \cap B_n$ with $A_n$, i.e., if we take $B = \Omega$), it is
enough to show

$$SFE(\mu) \le \liminf_{n \to \infty} -|\Lambda_n|^{-1} \log \int_{A_n} e^{-H^o_{\Lambda_n}(\phi)} \prod_{x \in \Lambda_n \setminus \{x_0\}} d\phi(x).$$

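By Lemma 2.4.2, if we can construct a sequence $\mu_n$ of slope $u$ measures with

$$\lim_{n \to \infty} SFE(\mu_n) \le \liminf_{n \to \infty} -|\Lambda_n|^{-1} \log \int_{A_n} e^{-H^o_{\Lambda_n}(\phi)} \prod_{x \in \Lambda_n \setminus \{x_0\}} d\phi(x),$$

and $\mu_n$ converging weakly to $\mu$, then it will follow that $SFE(\mu) \le \sigma(u)$.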

We define measures $\mu_n^\epsilon$ as follows: to sample from such a measure, first tile $\mathbb{Z}^d$ with cubes
of size $\Lambda_n$; on each cube, we will independently choose a configuration belonging to $A_n$ in
the following way. First, sample $\phi$ from $\mu$ and consider its restriction to $\Lambda_n$; we condition on
the event that $H_{\Lambda_n}(\phi) \le C|\Lambda_n|$, where $C$ is twice the specific V energy of $\mu$ (a value which is
finite by Lemma 4.1.8 and Lemma 2.3.8). The ergodic theorem implies the probability of this
event tends to one as $n$ tends to $\infty$. Then, we let $\phi_0$ be the outside interpolation function
whose existence is guaranteed by Lemma 5.3.5, and we re-sample the values in $\Lambda_n \setminus \Lambda_{(1-\epsilon)n}$
according to the box measure (whose existence is guaranteed by Lemma 4.1.10) centered at
$\phi_0$. (Throughout this argument, we tacitly assume that $(1 - \epsilon)n$ is rounded down to the
nearest integer.)

Now, we would like to show that for any $\delta$, if $\epsilon$ is small enough and $n$ large enough,
then the specific free energy will be less than $\delta + SFE(\mu)$. To see this, we will prove the result
for measures $\nu_n^\epsilon$ whose specific free energies are clearly higher than those of the $\mu_n^\epsilon$. To
sample from $\nu_n^\epsilon$, first sample from $\mu_n^\epsilon$, and then on each box, let $\phi_1$ be the interpolation
between $\Lambda_{(1-\epsilon)n}$ and $\Lambda_{(1-\epsilon)^2 n}$ guaranteed by Lemma 5.3.5; then re-sample the values in
$\Lambda_{(1-\epsilon)n} \setminus \Lambda_{(1-\epsilon)^2 n}$ from the box measure whose existence is guaranteed by Lemma 4.1.10. This
measure is equal to the Gibbs measure on $\Lambda_{(1-\epsilon)n}$ in the blocks forming a $(1 - \epsilon)^2$ fraction
of $\mathbb{Z}^d$, and outside, equal to a box measure with specific free energy bounded by a constant
(again, by Lemma 4.1.10). Thus, for a sufficiently small $\epsilon$, we have that for sufficiently large
$n$, $SFE(\nu_n^\epsilon) \le SFE(\mu) + \delta$. It is now trivial to check that $\mu$ is a limit point of the measures
$\mu_n^\epsilon$. □



6.4 Variational principle in other settings 



6.4.1 Discrete models 

To prove Theorem 16.3.11 in the setting E = Z, we take V — V (as before, assuming V is 
linear between integers and lower semi-continuous) and construct continuous interpolations 
exactly as in the $E = \mathbb{R}$ setting. Our definition of $\mu_n^\epsilon$ and $\nu_n^\epsilon$ is essentially the same as in
the previous case. We tile by $\Lambda_n$ blocks just as before, and choose the interpolations just as
before; the only difference is that instead of using a box measure, we use a random rounding:
i.e., we add a variable uniformly distributed in $[0, 1]$ to the entire choice of $\phi$ and then round
down (in the set $R$ consisting of points outside of the $(1 - \epsilon)^2$ boxes), after which we sample
the remainder of $\phi$ inside each of the boxes according to the appropriate Gibbs measure
(with boundary conditions given by the values of $\phi$ in $R$).

6.4.2 Lipschitz simply attractive potentials 

In the Lipschitz models (when $\Phi$ is not necessarily an ISAP, and we may not have a good
definition of V) we define $\mu_n^\epsilon$ and $\nu_n^\epsilon$ slightly differently: in this case, we use a truncation of
the kind used in the proof that $PBL(\mu) \le SFE(\mu)$ in Section 6.1.1. Because $\mu$ is Lipschitz,
Lemma 6.1.2 actually implies that as $n$ tends to infinity, the probability that $\phi$ sampled
from $\mu$ differs from the plane of slope $S(\mu)$ (with appropriate additive constant) by more
than $\epsilon n$ in the supremum norm tends to zero. Thus, as $n$ gets large, the probability
tends to one that truncations of the type used in Lemma 6.1.1 will not affect the value
of $\phi$ anywhere inside the box $\Lambda_{(1-\epsilon)n}$. These truncations play the role of the interpolations
used in the non-Lipschitz setting. It is not hard to define an "inside truncation" in a similar
fashion, to play the role of the inside interpolations.

6.4.3 Higher dimensions and perturbed ISAPs and LSAPs 

The change to higher dimensions (without perturbations) is essentially trivial; we simply
define the interpolations separately on each coordinate as before. The change to perturbed
potentials $\Phi + \Psi$ is also essentially trivial; we simply observe that if the expected $\Phi$ energy
in $\Lambda_n \setminus \Lambda_{(1-\epsilon)^2 n}$ is $o(|\Lambda_n|)$, the expected $\Psi$ energy is $o(|\Lambda_n|)$ as well.
Chapter 7

LDP for empirical measure profiles 



7.1 Empirical measure profiles and statement of LDP 

Let $D$ be a bounded domain, and the sequence $D_n$ a good approximation of $D$. The aim
of this chapter is to prove a large deviations principle (of speed $n^d$) for the behavior of
a random gradient configuration $\phi_n$ on $D_n$. Instead of merely considering the gradient
empirical measures, as in the previous section, we will consider profiles, which also contain
information about how the occurrences of various gradient events are distributed throughout
$D$. We will also investigate the behavior of the normalized functions $\frac{1}{n}\phi_n(nx)$ (interpolated
to functions on $D$) and examine their large deviations behavior with respect to topologies
induced by $L^p$ and Orlicz $L^A$ metrics. We can define an empirical profile measure
$R_{\phi_n, n} \in \mathcal{P}(D \times \Omega)$ by

$$R_{\phi_n, n} = \int_D \delta_{(x,\, \theta_{\lfloor nx \rfloor} \phi_n)}\, dx.$$

Informally, to sample a point $(x, a)$ we choose $x$ uniformly from $D$, and then take $a = \theta_{\lfloor nx \rfloor} \phi_n$.
(As in the previous chapter, it is convenient to write $\phi_n(x) = \infty$ for $x \notin D_n$.) Also, using $\phi_n$,
we will define a function <p n by interpolating the function -(f> n (nx) to a continuous, piecewise 
linear (on simplices) function on the simplex domain corresponding to D n ; each such 0„ is a 
member of the space Lq (D) constructed in Section 15 .2. 21 for Corollary 15.2.41 where we take 
V to be any function increasing essentially more slowly that Va, where V is defined as in 
Section HUl and V d is the Sobolev conjugate as defined in Section IB. 1.41 

Write n n = Z~ x e~ Ho ^ X\ x ^r> n \{ XQ } d<p(x) where the Z n are normalizing constants chosen 

to make fi n probability measures. Let p n be the measure on X = T^D x Q) x L$ (D) induced 
by fi n and the map 4> n t— > (i?^, n n ,0 n ). We say a measure \i G y(D x Q) is L-invariant if 
/x(-,fi) is a Lebesgue measure on D and for any D' C D of positive Lebesgue measure, 
fi{D', •) is an ^-invariant measure on Q. Given any subset D' of D with positive Lebesgue 
measure, we can write S(fi(D', •)) for the slope of the measure n{D', ■)/ p(D' x Q) (we have 
normalized to make this a probability measure) times fi(D' x Q); the map D' i— > S(fi(D', •)) 
is a signed/ vector- valued measure on D. Let X be the topology on 7{D x Q) x L v (D) which 
is the product of (on the first coordinate) the smallest topology in which fi — > fi(D' x /) is 
measurable for all rectangular subsets $D'$ of $D$ and bounded cylinder functions $f$ and (on
the second coordinate) the Lq (D) topology. We say an ISAP $ = $y is super-linear if 
lim^oo V (•)])/ 7] = oo; although many of our bounds still hold for ISAPs which are not super- 
linear, the full LDP, which we prove below, is false as stated when V is not super-linear. 

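Theorem 7.1.1 If $\Phi$ is a super-linear ISAP, then the measures $\rho_n$ satisfy a large deviations
principle with speed $n^d$ and rate function

$$I(\mu, f) = \begin{cases} SFE(\mu(D, \cdot)) - P(\Phi) & \text{if } \mu \text{ is } \mathcal{L}\text{-invariant and } S(\mu(x, \cdot)) = \nabla f(x) \text{ as a distribution,} \\ \infty & \text{otherwise} \end{cases}$$

on the space $\mathcal{P}(D \times \Omega) \times L^V(D)$ in the topology $\mathcal{X}$ described above.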

In the case that $\Phi$ is an SAP and $E = \mathbb{R}$, the uniqueness of the minimizer of the rate
function $I$ described above is an immediate consequence of the uniqueness of the gradient
Gibbs measure of a given slope (Theorem 8.6.3) and the strict convexity of $\sigma$ (Theorem
8.6.2, which in particular implies that $\sigma$ has a unique minimum). This will also imply
uniqueness of the rate function in the presence of boundary conditions (see Section 7.3.2).



7.2 Proof of LDP 

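Recall that, in general, a sequence of measures $\rho_n$ on a topological space $(X, \mathcal{X})$ is said to
satisfy a large deviations principle with rate function $I$ and speed $n^d$ if $I : X \to [0, \infty]$ is
lower-semicontinuous and, for all Borel sets $B \subseteq X$ (with interior $B^\circ$ and closure $\overline{B}$),

$$-\inf_{x \in B^\circ} I(x) \le \liminf_{n \to \infty} n^{-d} \log \rho_n(B) \le \limsup_{n \to \infty} n^{-d} \log \rho_n(B) \le -\inf_{x \in \overline{B}} I(x).$$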
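To make the definition concrete, here is a toy numerical illustration that is not part of the source: the classical large deviations principle for fair coin flips, with speed $n$ in place of $n^d$. The function names and the choice of threshold 3/4 are illustrative assumptions. The sketch computes the exact tail probability with integer arithmetic and compares $-n^{-1} \log P(S_n \ge an)$ with the known rate $I(a) = a \log(2a) + (1-a)\log(2(1-a))$.

```python
from fractions import Fraction
from math import ceil, comb, log


def rate(a):
    """Cramer rate function for the mean of fair coin flips, at a in (1/2, 1)."""
    a = float(a)
    return a * log(2 * a) + (1 - a) * log(2 * (1 - a))


def empirical_rate(n, a):
    """-(1/n) log P(S_n >= a n) for S_n ~ Binomial(n, 1/2), computed exactly."""
    k_min = ceil(a * n)
    count = sum(comb(n, k) for k in range(k_min, n + 1))  # exact big integer
    return -(log(count) - n * log(2)) / n


a = Fraction(3, 4)
for n in (200, 2000):
    print(n, empirical_rate(n, a), rate(a))
```

As $n$ grows, the empirical exponent approaches $I(a)$; the gap is of order $(\log n)/n$, which is invisible at the exponential scale that the large deviations principle describes.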

Let $\mathcal{A}$ be any basis of $\mathcal{X}$. It is not hard to check (and is proved in [22], Theorem 4.1.11 and
Lemma 1.2.18) that the large deviations principle is a consequence of the following three
statements:

1. Lower bound on probabilities:

$$\inf_{S \in \mathcal{A};\, x \in S} \limsup_{n \to \infty} \frac{1}{n^d} \log \rho_n(S) \ge -I(x).$$

2. Upper bound on probabilities:

$$\inf_{S \in \mathcal{A};\, x \in S} \liminf_{n \to \infty} \frac{1}{n^d} \log \rho_n(S) \le -I(x).$$

3. Exponential tightness: For every $\alpha < \infty$, there exists a compact set $K_\alpha \subset X$
for which $\limsup_{n \to \infty} n^{-d} \log \rho_n(X \setminus K_\alpha) < -\alpha$.
The above conditions also imply that all the level sets $M_a = \{x : I(x) \le a\}$ are compact (by
Lemma 1.2.18 of [22]). When the first and second statements hold, we say that the $\rho_n$ satisfy
a weak large deviations principle with rate function $I$. When the third statement holds, we
say that the $\rho_n$ are exponentially tight. In the following subsections, we will prove Theorem
7.1.1 by checking each of these statements in turn. We will do this first for the simplest
case (when $\Phi$ is an ISAP and $E = \mathbb{R}$) and address generalizations in Section 7.3.

7.2.1 Lower bounds on probabilities 

The lower bound follows almost immediately from Lemma 5.4.3, Lemma 5.4.4, and
Theorem 6.1.1; in particular, the fact that $SFE(\mu) = PBL(\mu)$.

Suppose we are given $I(f, \mu) < \infty$. This implies that $f \in L^{1,A}(D)$, and hence, by Lemma
5.4.3, for all $n$ sufficiently large, there exists a function $F_\epsilon \in L^{1,A}(D)$ that is equal to a
piecewise linear approximator $F_n$ (as defined prior to Lemma 5.4.3) on a closed subset $D'$ of
$D$, where the volume of $D \setminus D'$ is less than $\epsilon$ and $\int_{D - D'} A(\nabla F_\epsilon) < \epsilon$.

Now, a basis set $S \in \mathcal{A}$ centered at $(f, \mu)$ can be written as the set of pairs $(g, \nu)$ for
which $\delta(f, g) < \gamma$ (for some $\gamma > 0$, where $\delta$ is the distance function described in Section
15.2.21— rough lv. S(f,g) = ||/ - g>||y*) and \jj,(l D .Hi) - u(l D .Hi)\ < 7 for each of a finite set 
of cylinder functions $H_i$ (each bounded between 0 and 1) on $\Omega$ and rectangular subsets $D_i$
of $D$.

Assume without loss of generality (rescaling if necessary) that the volume of D is less 
than one. 

Now, given any $F_\epsilon$ and a large $n$, we define a measure $\mu_{\epsilon,n}$ on configurations on $D_n$
as follows: to sample from $\mu_{\epsilon,n}$, first compute the approximation $F_\epsilon^n : D_n \to \mathbb{R}$ of $F_\epsilon$
guaranteed by Lemma 5.4.4 (with $A = V$) and fix $\phi(x_0) = F_\epsilon^n(x_0)$ for some reference vertex
$x_0$. Then sample $\phi(x)$ from the box measure centered at $F_\epsilon^n$ (the one whose existence is
given by Lemma 4.1.7) for all values of $x \in D_n \setminus \{x_0\}$ for which $x$ does not lie on the interior
of one of the linear regions of $F_\epsilon$. The free energy of this process is, by Lemma 4.1.7 and
Lemma 5.4.4 and the assumed bound on $\int_{D - D'} A(\nabla F_\epsilon)$, at most a constant times $\epsilon n^d$.

Then, inside each large box A n of D n — assume it has size — approximating a box A 
on which F e is linear, we sample from (appropriately translated) the Gibbs measure on 

A n fl G n where G is the set of measures v for which \v{HA — < 7/2 for each i. See 

Figure 17.11 

If $\epsilon < \gamma/2$ and $n$ is sufficiently large, then it is not hard to see that $\mu_{\epsilon,n}$ is supported inside
the set $S$. The free energy from the box measure choice in the non-linear sections of $\mu_{\epsilon,n}$
is $O(\epsilon n^d)$, and the free energy from the remaining choice is at most $SFE(\mu(\Lambda \times \cdot)/\mu(\Lambda \times \Omega))$
times $n^d |\Lambda|$, where $\Lambda$ is the union of the square regions of $\mathbb{R}^d$ on which $F_\epsilon$ is linear. It follows
from Theorem 6.1.1 (since $SFE(\mu) = PBL(\mu)$) that $\limsup_{n \to \infty} n^{-d} FE(\mu_{\epsilon,n}) \le I(f, \mu) +
o(\epsilon)$, which implies the desired lower bound on probabilities.
Figure 7.1: Above, a mostly linear function $F_\epsilon$ (with dotted lines among the non-linear parts
describing the bounds of a corresponding box measure), in the trivial case $d = 1$. Below, an
instance of $\phi$ sampled from $\mu_{\epsilon,n}$: in non-linear parts, $\phi$ is chosen from a box measure; in the
linear parts, $\phi$ is chosen with empirical measure constraints.

7.2.2 Upper bounds on probabilities when $I(\mu, f) < \infty$

The upper bound follows almost immediately from Lemma 6.1.1; in particular, the fact that
$SFE(\mu) = FBL(\mu)$. Since $I(\mu, f) < \infty$, $\mu$ is an $\mathcal{L}$-invariant gradient measure. Now, we may
partition $D$ into disjoint cubes $K_1, \ldots, K_j$ of equal size that cover at least a $1 - \epsilon$ fraction
of the Lebesgue volume of $D$, and let $K = \bigcup_{i=1}^{j} K_i$. In particular, for any basis set $S$, for a
fine enough partition, we will have fi(K, ■)/ fi(K,Q); the liminf of the average specific free 
energy within K — as we choose S small enough so that u(K, •) lies in a sufficiently small 
neighborhood of fi(K, •) — is at least SFE(n(K, •)). As e tends to zero, fi(K, •) converges to 
fi(D, •) in the topology of local convergence. The average specific free energy outside of K 
is at least a (as defined in Corollary I2.3.2|) . This gives the desired bound. 

7.2.3 Upper bounds on probabilities when $I(\mu, f) = \infty$

If $\mu$ is $\mathcal{L}$-invariant and $S(\mu(x, \cdot)) = \nabla f(x)$ as a distribution and nonetheless $I(\mu, f) = \infty$
(because $SFE(\mu(D, \cdot)) = \infty$), then the argument is the same as in the previous section. We
have now to show the upper bound on probabilities when either $S(\mu(x, \cdot)) \ne \nabla f(x)$ as a
distribution or $\mu$ is not $\mathcal{L}$-invariant.

First, if $\mu$ is not $\mathcal{L}$-invariant, then there is an event $H$ and a rectangle $D'$ for which
$\mu(D' \times H) > \mu(D' \times \theta_x H) + \delta$ for some $\delta > 0$. We choose a neighborhood $S$ such that
for each $(g, \nu)$ in this neighborhood, $\nu(D' \times H) > \nu(D' \times \theta_x H) + \delta/2$. Now, it is not hard
to see that the probability of belonging to this neighborhood becomes zero when $n$ is large
enough.

Second, suppose that there exists a rectangle $D'$ for which $S(\mu(x, D'))$ is not equal to
the mean value of $\nabla f(x)$ on $D'$. Suppose these two values disagree in the $i$th component
direction, and suppose that $D'$ has length $L$ in the $i$th direction and normal cross-sectional
area given by $a$. Then, integrating in the $i$th direction, this implies that the difference $\delta$ of the
mean value of $f$ on opposite sides of $D'$, divided by $L$, is different from $\mu(D', \phi(e_i) - \phi(0))$;
different by, say, $\gamma > 0$. Now, let

$$H_C(\phi) = \begin{cases} \phi(e_i) - \phi(0) & |\phi(e_i) - \phi(0)| \le C \\ -C & \phi(e_i) - \phi(0) < -C \\ C & \phi(e_i) - \phi(0) > C \end{cases}$$

By letting $C$ get large and choosing a neighborhood that ensures that the average value of
$H_C$ on $D'$ tends to $S(\mu(x, D'))$, by continuity of the average cross-sectional area in $f$,
if we take $S$ also to include a sufficiently tight restriction on $g$, then we can force the sum
of $\phi(x + e_i) - \phi(x)$ over an arbitrarily small fraction of the points $x$ to grow in $n$ like $\gamma n^d$.
Since $V$ is super-linear, taking this fraction arbitrarily small implies, as desired, that

$$\inf_{S \in \mathcal{A};\, x \in S} \liminf_{n \to \infty} \frac{1}{n^d} \log \rho_n(S) \le -\infty.$$

7.2.4 Exponential tightness 

We define the set $K_C \subset \mathcal{P}(D \times \Omega) \times L^V_0(D)$ to be the set of profile/interpolation pairs
corresponding to functions from $D_n$ to $\mathbb{R}$ with average V energy per edge equal to or less
than $C$. Exponential tightness will follow once we show that

1. For each $C$, the set $K_C$ is pre-compact in $(X, \mathcal{X})$.

2. For any $\alpha > 0$, if we choose $C$ large enough, we will have

$$\liminf_{n \to \infty} n^{-d} \log \mu_n(K_C^c) < -\alpha.$$

For the first statement, it is enough to show that the projection of Kc onto each of its two 
components — in CP(D x fl) and in Lq (D) — is pre-compact in the corresponding topology. 
The first is a simple exercise; the second follows from Lemma 5.2.4.

For the second statement, by Lemma \A. 1.71 it is enough to prove an analogous statement 
using V instead of V. Suppose it were the case that for some a, 

liminf log /i n (l„-d| HoW |>(7) > -a 

for every C. Then this would imply that if we replaced $ with (1 — e)$, for any small e > 0, 
then we would have 

\immin- d \og J e- H ° {(t,) JJ =00, 

x€D n \{x } 

i.e., the log partition function growing super-exponentially in n. This is a contradiction to 
Corollary [Ql 
7.3 LDP in other settings



In this section, we briefly describe some variants of Theorem 7.1.1 (to $E = \mathbb{Z}$ or $m > 1$ or $\Phi$ not
an ISAP) and describe the modifications to the proof required for these settings. However,
we will not reproduce in detail the proof of Theorem 7.1.1 in each of these settings, as this
would consume a good deal of space and the modifications are all straightforward.

7.3.1 Discrete models and Lipschitz models 

All of the arguments in this chapter carry through to higher dimensional spin spaces,
perturbed systems, and discrete models (using similar rounding arguments to those described
in the previous chapter) with little or no modification. However, some additional care is
required in the case that $E = \mathbb{Z}$ but $\Phi$ is Lipschitz, so that $V$ fails to be everywhere finite.
In this case, $\sigma$ does not approach infinity near the boundary of the space of allowable slopes,
so it is possible that the rate function $I(f, \mu)$ will be non-infinite even if $\nabla f(x)$ lies on this
boundary for $x$ in a subset of $D$ with positive measure. If this is the case, we say that $f$ is a
taut height function. Since the variational principle and the uniqueness of the gradient phase
of slope $u$ (as shown in Chapter 8) may not apply when $u$ is one of these boundary slopes,
we cannot expect the minimizer of the rate function to be unique in this case. However, the
large deviations principle does go through. A simple analytical argument shows that we can
always approximate a taut $(f, \mu)$ by a sequence of pairs $(f_i, \mu_i)$ for which the $f_i$ are not taut;
this enables us to deduce the necessary lower bound, and the upper bounds and exponential
tightness arguments are the same as before.

7.3.2 LDP with boundary conditions 

In the continuous setting, we sometimes wish to limit our attention to functions $f$ that
extend to the closure of $D$ and satisfy $f = f_0$ on $\partial D$, where $f_0 : \partial D \to \mathbb{R}^m$ is a continuous
boundary condition. Of course, elements of $L^V(D)$ are not continuous for general $V$, and
in particular need not be continuous at the boundary of $D$. But we will say that a function
$f$ on $D$ has $f_0$ as its boundary if $f$ is a limit in $L^V(D)$ of functions in $L^V(D)$, each of which
agrees with $f_0$ outside of a compact subset of $D$.

We would like to impose similar conditions on the discrete models. However, since none
of the elements of the boundaries of the $D_n$ actually lies on the boundary of $D$, we cannot
simply require that $\phi(x) = f_0(x)$ for $x \in \partial D_n$. In fact, there are many ways to specify
discrete boundary conditions; we will choose the one that is most convenient for us.

Assume that $f_0$ extends continuously to a function in $L^{1,A}(D)$; then we define, as in the
previous section, the functions $f_0^n$ approximating $f_0$ on $D_n$, as in Lemma 5.4.4, and box
measures $\nu_n^\epsilon$ centered at these functions. Now, take $\mu_n$ to be the sequence of measures

$$\mu_n = Z_n^{-1} e^{-H^o(\phi)} \prod_{x \in D_n \setminus \{x_0\}} 1_{X_n}\, d\phi(x),$$

where the $Z_n$ are normalizing constants chosen to make $\mu_n$ probability measures and each
$X_n$ is the set of $\phi$ for which $|\phi(x) - f_0^n(x)| < \epsilon$ for each $x \in \partial D_n$. These induce measures
$\rho_n$, defined as above.
Theorem 7.3.1 If $\Phi$ is a super-linear ISAP, then the measures $\rho_n$ satisfy a large deviations
principle with speed $n^d$ and rate function (up to an additive constant) given by

$$I(\mu, f) = \begin{cases} SFE(\mu(D, \cdot)) - P(\Phi) & \text{if } \mu \text{ is } \mathcal{L}\text{-invariant, } S(\mu(x, \cdot)) = \nabla f(x) \text{ as a distribution, and } f \text{ has } f_0 \text{ as its boundary,} \\ \infty & \text{otherwise} \end{cases}$$

on the space $\mathcal{P}(D \times \Omega) \times L^V(D)$ in the topology $\mathcal{X}$ described above.

Proof We assume $m = 1$ (the extension to $m > 1$ being straightforward). Exponential
tightness and the upper bounds on probabilities in the case that $f$ has $f_0$ as its boundary are
the same as in the no-boundary case. If $f$ agrees with $f_0$ outside of a compact subset of $D$,
then the lower bound argument is also exactly the same as before (noting that the approximation
of $f$ defined in Lemma 5.4.3 agrees with $f$, and hence $f_0$, on the boundary of $D_n$ for all
large enough $n$); since, by definition, any $f$ that has $f_0$ as its boundary is a limit in $L^V(D)$
of functions with this property, this gives the lower bound in general.

It remains only to check the upper bound on probabilities in the case that $f$ does not have
$f_0$ as its boundary (and hence $I(\mu, f) = \infty$). We know by compactness that the probability
that $\phi$ fails to have a subsequential limit in $L^V(D)$ tends to zero
super-exponentially. However, in light of the discrete boundary conditions, it is not hard to see
that any subsequential limit in $L^V(D)$ of $\phi$ chosen from $\rho_n$ must have $f_0$ as its boundary. □



The generalization to LSAPs with $E = \mathbb{Z}$ is straightforward when $f_0$ is not taut, since
in this case, any $f$ extending $f_0$ can be approximated by functions which are not taut.
However, if $f_0$ is taut (for example, if it is a plane of slope $u$) then the empirical measure
large deviations principle need not hold. If there are distinct ergodic Gibbs measures $\mu_1$ and
$\mu_2$ of slope $u$ that have different specific free entropies, and $\phi_1$ and $\phi_2$ are samples from $\mu_1$
and $\mu_2$, then the large deviations behavior of the sequence of Gibbs measures with boundary
conditions given by $\phi_1$ outside of $\Lambda_n$ will be different from the one with boundary conditions
given by $\phi_2$ outside of $\Lambda_n$.



7.3.3 Gravity and other external fields 

Let $h : X \to \mathbb{R}$ be any continuous function on $X$; we would now like to replace the measure
$\rho_n$ with $\pi_n = e^{-h} \rho_n$ (times the normalizing constant that makes $\pi_n$ a probability measure).
These new measures $\pi_n$ clearly satisfy the upper and lower bounds on probabilities described
above when $I(x)$ is replaced by $I_0(x) = I(x) + h(x)$. When $m = 1$, a typical example of a
continuous function $h$ might be (in the presence of boundary conditions) $h(f, \mu) = \int_D f(\eta)\, d\eta$;
this corresponds to weighting a configuration with additional energy proportional to the
"gravitational potential" energy of the surface (causing typical surfaces to sag lower in the
interior of $D$). Another example of such an $h$ might be $h(f, \mu) = \mu(H)$ for some function
$H : \Omega \to \{0, 1\}$; this corresponds to weighting by the number of times a particular local
configuration appears.

We would like to argue that for some constant $C$, the $\pi_n$ satisfy a large deviations principle
with rate function $I(x) + h(x) + C$. But this follows immediately from Varadhan's Integral
Lemma (Theorem 4.3.1 of [22]) provided that a certain tightness condition holds. Namely,
we require

lim lim sup n d log E ( e

This holds for the gravitational potential described above (when $\sigma$ is super-linear) and many
other kinds of external fields.







Chapter 8 
Cluster swapping 



Cluster swapping is a simple geometric operation that we will use to prove strict convexity of
the surface tension function $\sigma$ and to classify $\mathcal{L}$-ergodic gradient Gibbs measures and their
extremal decompositions (which may be non-trivial if $E = \mathbb{Z}$) whenever $\Phi$ is an $\mathcal{L}$-invariant
simply attractive potential. Throughout this chapter and Chapter 9, we will assume that
$m = 1$, so that either $E = \mathbb{Z}$ or $E = \mathbb{R}$, and that $\Phi$ is an $\mathcal{L}$-invariant SAP.

The prerequisites for this chapter are Chapters El El and|3J The only results from Chapters 
EJ El and [7| that we will even mention in either this chapter or Chapter El are Corollary 16.1.51 
(which gives the alternate definition of $\sigma$ using limits of log partition functions on tori) and
the second half of the variational principle, Theorem 6.3.1; the latter we mention only
in the following remark.

Recall that the variational principle has two parts: the first, which we will use frequently,
is Theorem 2.5.2, which states that whenever $SFE(\mu) = \sigma(S(\mu))$, the measure $\mu$ is a gradient
Gibbs measure. This result holds for all simply attractive potentials. Recall also that a
minimal $\mathcal{L}$-ergodic gradient phase of slope $u$ is defined to be a slope-$u$ $\mathcal{L}$-ergodic gradient
Gibbs measure $\mu$ on $(\Omega, \mathcal{F}^\tau)$ with minimal specific free energy; i.e., $SFE(\mu) = \sigma(u)$. In this
chapter, we will classify the minimal $\mathcal{L}$-ergodic gradient phases of slope $u$ for any $u \in U_\Phi$.

The second half of the variational principle, Theorem 6.3.1, states that every $\mathcal{L}$-ergodic
gradient phase $\mu$ of slope $u \in U_\Phi$ is in fact a minimal $\mathcal{L}$-ergodic gradient phase. We have
proved the second half only for perturbed isotropic and (discrete) Lipschitz simply attractive
potentials; it is not known whether Theorem 6.3.1 can be extended to all simply attractive
potentials. In the cases where Theorem 6.3.1 is true, our classification of the minimal
$\mathcal{L}$-ergodic gradient phases of slope $u$ may be interpreted as a classification of all $\mathcal{L}$-ergodic
gradient phases of slope $u$.
8.1 Introduction to cluster swapping



8.1.1 Review: Fortuin-Kasteleyn; Swendsen-Wang; 
Edwards-Sokal updates 

Before describing cluster swapping, we review some facts about the related Swendsen-Wang 
algorithm, introduced in 1987 jHl] and generalized to the form we present below by Edwards 
and Sokal in 1989 [30]. For this subsection only, we let (E, £) be a finite set endowed with 
counting measure A (for example, E could be { — 1, 1}, as in the Ising model), A any finite 
graph with a subset dA of its vertices designated as boundary vertices, and $ = {$a} 
any Gibbs potential on functions £ : A — > E. The crucial idea is the introduction of an 
independent random auxiliary function called the residual energy. 

Let (£, r) be a random pair in which £ : A — > E is sampled from the Gibbs measure 
e Ha Yl x eA\dA^( x ) (times a normalizing constant), with boundary values £ = £ fixed on 
dA, and r is an independent function from the subsets of A to [0, oo) where the values r(A) 
are all independent exponentials with parameter 1, i.e., distributed according to the measure 
e~ x dx on [0,oo). We refer to the quantity r(A) as the residual energy in A, $a(C) as the 
potential energy in A, and t(A) := r(A) + $a(6) as ^ ne total energy in A. Note that the 
probability density of the pair (£, r) with respect to the natural underlying measure (i.e., 
^|A\5A| times the product — over all A C A — of Lebesgue measure on [0, oo)) is proportional 
to e~ 1*1 where \t\ = £ AcA t(A). 

A general version of the random Swendsen-Wang update jHl] (as described by Edwards 
and Sokal [301) to the pair (£, r) is the following: first re-sample all of the residual energies 
r(A) for A C A from the marginal law of r (i.e., e~' r ' J^Iaca dr(A), where each dr(A) is 
Lebesgue measure on [0, oo)). Then re-sample the pair (£, r) conditioned on the total energy 
function t. 

If the latter step happens to be computationally easy (as it turns out to be for Ising
and Potts models and spin glasses, see below), then one can often efficiently generate
(approximately) random samples from the Gibbs measure by beginning with a deterministic
pair $(\xi, r)$ and applying the Swendsen-Wang update repeatedly [3UJ |SI]. This method of
Monte Carlo sampling, called the Swendsen-Wang algorithm, is the subject of a large
literature. (See also [55] for an exact sampling analog of Swendsen-Wang.) Although we will
not discuss sampling problems in this paper, we will use related random updates to generate
couplings and to prove other results.

Remark: The Edwards-Sokal formulation is cosmetically different from ours. They first
add additive constants to the functions $\Phi_A$ if necessary so that they are all non-negative
and replace what we call "total energy" $t(A)$ with the quantity $t'(A) = e^{-t(A)}$. They then
study the joint law of $(\xi, t')$ instead of $(\xi, t)$ or $(\xi, r)$. We will use the fact that
conditioned on $\xi$, the law of $t(A)$ is $\Phi_A(\xi)$ plus an independent parameter one
exponential. Edwards and Sokal use the fact that conditioned on $\xi$, the law of $t'(A)$ is
uniform on $[0, e^{-\Phi_A(\xi)}]$. Edwards and Sokal also do not interpret $t'$ as an "energy."
For our purposes, it is more natural to deal with $r$ or $t$ and interpret them as energies,
given the role they will later play in variational principles, SFE-preserving updates on
infinite systems, etc.
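
As a short worked check of the uniform law quoted in the remark (a sketch that uses only the definitions $t(A) = \Phi_A(\xi) + r(A)$ and $t'(A) = e^{-t(A)}$ given above, with $r(A)$ a parameter one exponential), one can compute the conditional distribution function of $t'(A)$ directly:

```latex
% For 0 <= s <= e^{-\Phi_A(\xi)}, conditioned on \xi:
\mathbb{P}\bigl(t'(A) \le s \,\big|\, \xi\bigr)
  = \mathbb{P}\bigl(e^{-\Phi_A(\xi)}\, e^{-r(A)} \le s\bigr)
  = \mathbb{P}\bigl(r(A) \ge -\log\bigl(s\, e^{\Phi_A(\xi)}\bigr)\bigr)
  = s\, e^{\Phi_A(\xi)} .
```

This is linear in $s$ on $[0, e^{-\Phi_A(\xi)}]$, which is the distribution function of the uniform law on that interval.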

From here on, we will specialize to the case that $\Phi$ is a nearest-neighbor pair potential.
In this case, the values of $r$ on sets $A$ that are not endpoint pairs of edges of $\Lambda$ are
irrelevant to the way $\xi$ is updated in the Swendsen-Wang algorithm, since the total energy in
such a set is $r(A)$ independently of $\xi$. From here on, we will ignore these $r(A)$, and think
of $r$ as a function on the edges of $\Lambda$ only.

The simplest and most well-studied example (and the one that will turn out to be most
relevant to random surfaces) is the Ising model with coupling constants $K_e \in \mathbb{R}$ on
the edges $e$ of $\Lambda$, i.e., $E = \{-1, 1\}$ and $\Phi_{(x,y)}(\xi) = K_e \xi(x)\xi(y)$ for each
edge $e = (x,y)$. Then the potential energy $K_e \xi(x)\xi(y)$ of $e$ takes on only the two values
$\pm K_e$, and the total energy in $e$ is given by $t(e) = K_e \xi(x)\xi(y) + r(e)$. The expression
$r(e) = t(e) - K_e \xi(x)\xi(y)$ is always non-negative. If we fix $t(e)$ and $t(e) < |K_e|$, then
this can only be the case if $K_e \xi(x)\xi(y)$ is negative, i.e., the edge energy is the lower of
its two possible energies (in which case the edge is said to be satisfied by $\xi$). If
$t(e) > |K_e|$ then this will be the case for both possible values of the potential energy. Let
$\mathbb{S}$ be the set of edges of $\Lambda$ at which $t(e) > |K_e|$. An edge is said to be open
if it lies in the complement of $\mathbb{S}$.

The reader may verify the following: 

Lemma 8.1.1 Let $\xi$ be a random function on $\Lambda$ with law given by the Ising model with
coupling constants $\{K_e\}$, with boundary conditions $\xi_0$ on $\partial\Lambda$. Let $r$ be an
independent product of parameter one exponentials. If we condition on the total energy function
$t$ (which determines $\mathbb{S}$), then (any regular version of) the conditional law of $\xi$
satisfies the following almost surely:

1. All open edges are a.s. satisfied by $\xi$. This fact and the boundary conditions uniquely
determine $\xi$ on each open cluster that contains a vertex of $\partial\Lambda$.

2. In each open cluster that does not contain a vertex of $\partial\Lambda$, there are exactly two
ways, differing by a sign change, of defining $\xi$ on that cluster so that all the edges in the
cluster are satisfied.

3. The law of $\xi$ conditioned on $t$ is given by tossing an independent fair coin to determine
the sign of each open cluster of $\Lambda$ that does not contain a boundary vertex.

In other words, conditioned on $t$, there are $2^N$ ways to choose a pair $(\xi, r)$ with total
energy $t$, where $N$ is the number of open clusters that do not contain a boundary vertex, and
each of them is equally likely. (Note that isolated vertices, i.e., vertices all of whose edges
are in $\mathbb{S}$, are also clusters, so the coin toss applies to these sites as well.) In
particular, all the information needed for determining the law of $\xi$ conditioned on $t$ is
contained in $\mathbb{S}$. Note that $\mathbb{S}$ is the set of edges $e$ that are either
unsatisfied or are satisfied and have $r(e) > 2|K_e|$. Thus, conditioned on $\xi$, the law of
$\mathbb{S}$ is given by a Bernoulli percolation on $\Lambda$, with an edge belonging to
$\mathbb{S}$ with probability 1 if $e$ is unsatisfied and with probability
$\int_{2|K_e|}^{\infty} e^{-x}\,dx = e^{-2|K_e|}$ if $e$ is satisfied. The Swendsen-Wang update to
$\xi$ described in [HI] is performed by first sampling the set of open edges (using Bernoulli
percolation on the satisfied clusters of $\xi$ with the probabilities given above) and then
tossing a fair coin to decide the sign of $\xi$ on each open cluster that does not contain a
boundary vertex. (The law of the set of open clusters of $(\xi, r)$, called Fortuin-Kasteleyn
clusters, is also simple and was described by Fortuin and Kasteleyn in 1972 [31]. A similar
analysis applies to Potts models.)
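
The two-step update just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an implementation taken from the references: the function name `swendsen_wang_update`, the dictionary-based graph representation, and the use of union-find to identify open clusters are all choices made here for the example. It follows the sign convention of this section, in which the edge energy is $K_e \xi(x)\xi(y)$ and an edge is satisfied when that energy is negative.

```python
import math
import random


def swendsen_wang_update(xi, edges, K, boundary, rng):
    """One Swendsen-Wang update of an Ising configuration (illustrative sketch).

    xi: dict vertex -> +1 or -1
    edges: list of (x, y) pairs
    K: dict edge -> coupling K_e (edge energy is K_e * xi[x] * xi[y])
    boundary: set of vertices whose values stay fixed
    rng: a random.Random instance
    Returns (new_xi, open_edges).
    """
    # Step 1: resample residual energies. Only one event matters: a satisfied
    # edge is open when r(e) <= 2|K_e|, which has probability 1 - exp(-2|K_e|).
    open_edges = []
    for e in edges:
        x, y = e
        satisfied = K[e] * xi[x] * xi[y] < 0
        if satisfied and rng.random() < 1.0 - math.exp(-2.0 * abs(K[e])):
            open_edges.append(e)

    # Step 2: find the open clusters with a small union-find structure.
    parent = {v: v for v in xi}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for x, y in open_edges:
        parent[find(x)] = find(y)

    # Clusters touching the boundary keep their values; every other cluster
    # gets an independent fair coin toss deciding whether to flip its sign.
    frozen = {find(v) for v in boundary}
    flip = {}
    new_xi = {}
    for v in xi:
        root = find(v)
        if root in frozen:
            new_xi[v] = xi[v]
        else:
            if root not in flip:
                flip[root] = rng.random() < 0.5
            new_xi[v] = -xi[v] if flip[root] else xi[v]
    return new_xi, open_edges
```

Because a whole cluster is flipped at once, every open edge remains satisfied after the update, in line with item 1 of Lemma 8.1.1.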

8.1.2 Random surfaces and Ising models 

We now return to the main setting of this chapter: $\Phi$ is an SAP, $E = \mathbb{Z}$ or
$\mathbb{R}$, and $\Lambda \subset\subset \mathbb{Z}^d$. Given a pair $\phi_1, \phi_2 \in \Omega$ of
admissible functions (as defined in Section 1.1.2), define a non-decreasingly-ordered-pair valued
function

$\xi(x) = (\min\{\phi_1(x), \phi_2(x)\}, \max\{\phi_1(x), \phi_2(x)\})$

and a $\{-1, 0, 1\}$-valued function
$\zeta(x) = 1_{\phi_1(x) > \phi_2(x)} - 1_{\phi_1(x) < \phi_2(x)}$. We will also sometimes
interpret $\xi(x)$ as representing the unordered set $\{\phi_1(x), \phi_2(x)\}$ since it contains no
information about which of the two values came from which function. Note that $\xi_1(x)$ refers to
the smaller of $\phi_1(x)$ and $\phi_2(x)$ and $\xi_2(x)$ to the larger. In light of the following
trivial result, we may think of the map $(\phi_1, \phi_2) \to (\xi, \zeta)$ as a measure-preserving
change of coordinates.

Lemma 8.1.2 The map $E^2 \to E^2 \times \{-1, 0, 1\}$ that sends $(\eta_1, \eta_2)$ to

$((\min\{\eta_1, \eta_2\}, \max\{\eta_1, \eta_2\}),\ 1_{\eta_1 > \eta_2} - 1_{\eta_1 < \eta_2})$

is injective. The $\lambda \times \lambda$ measure of any measurable subset of $E \times E$ is equal
to the $\lambda \times \lambda$-times-counting measure of its image under this map.
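
The injectivity claim is easy to check by writing the map and its inverse explicitly. The following is a small sketch; the function names `to_ordered` and `from_ordered` are chosen here for illustration only.

```python
def to_ordered(eta1, eta2):
    """Send (eta1, eta2) to ((min, max), sign of eta1 - eta2)."""
    xi = (min(eta1, eta2), max(eta1, eta2))
    zeta = (eta1 > eta2) - (eta1 < eta2)  # value in {-1, 0, 1}
    return xi, zeta


def from_ordered(xi, zeta):
    """Recover (eta1, eta2) from the ordered pair and the sign."""
    lo, hi = xi
    # zeta == 1 means eta1 was the larger value; otherwise eta1 is the smaller
    # one (when zeta == 0 the two values coincide, so the order is immaterial).
    return (hi, lo) if zeta == 1 else (lo, hi)
```

Since `from_ordered` undoes `to_ordered` on every input pair, the map has a left inverse and is therefore injective.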

We now make the connection between this setup and the previous section: 

Lemma 8.1.3 Let $\phi_1, \phi_2$ be an independent pair of random functions sampled from
$\gamma_\Lambda(\cdot, \phi_1^0)$ and $\gamma_\Lambda(\cdot, \phi_2^0)$ (where
$\phi_1^0, \phi_2^0 \in \Omega$ are admissible functions that determine the boundary conditions
outside of $\Lambda$). If $(\xi, \zeta)$ are constructed from $(\phi_1, \phi_2)$ as above, then
conditioned on $\xi$, it is almost surely the case that (any regular version of) the conditional
law of $\zeta$ in $\Lambda \setminus \{x \in \Lambda : \phi_1(x) = \phi_2(x)\}$ is given by a
ferromagnetic Ising model with coupling constants $K_{(x,y)} \le 0$ given by the potential edge
energy $\Phi_e(\phi_1) + \Phi_e(\phi_2)$ when one of the $\phi_i$ is greater than or equal to the
other at both endpoints minus the potential edge energy when this is not the case. Explicitly,

$K_{(x,y)} = [V_{x,y}(\xi_1(y) - \xi_1(x)) + V_{x,y}(\xi_2(y) - \xi_2(x))]$    (8.1)
$\qquad\quad - [V_{x,y}(\xi_1(y) - \xi_2(x)) + V_{x,y}(\xi_2(y) - \xi_1(x))].$    (8.2)

Proof If $E = \mathbb{R}$, then $\{x \in \Lambda : \phi_1(x) = \phi_2(x)\}$ is almost surely empty.
For each possible value of $\xi$ restricted to $\Lambda$, the map $(\phi_1, \phi_2) \to \xi$ has
$2^{|\Lambda|}$ inverses, and the map is Lebesgue measure preserving (in fact, affine) in a
neighborhood of each of them. We conclude that the conditional law of each possible inverse
$(\phi_1, \phi_2)$ of $\xi$, given $\xi$, is proportional to its Gibbs weight
$e^{-H_\Lambda(\phi_1) - H_\Lambda(\phi_2)}$, and this proves the lemma. A similar argument applies
when $E = \mathbb{Z}$. In this case, there are
$2^{|\{x \in \Lambda :\ \phi_1(x) \ne \phi_2(x)\}|}$ possible inverses, and the probability of each
of them is proportional to its Gibbs weight. The fact that the value of $K_e$ described above is
always non-positive (and hence the model is ferromagnetic) follows immediately from the
convexity of the $V_{x,y}$ and Lemma 4.1.5. ∎

Remark: Lemma 8.1.3 also applies when the $V_{x,y}$ are non-convex, but the $K_e$ may be
positive in this case, i.e., the corresponding Ising model may not be ferromagnetic. It also
applies if we replace $E = \mathbb{Z}$ or $E = \mathbb{R}$ with another choice of $E$ (e.g., a
finite set) and define $\xi$ and $\zeta$ in terms of an arbitrary ordering on $E$ (which may not
come with a canonical ordering, as $\mathbb{Z}$ and $\mathbb{R}$ do).

8.1.3 Defining cluster swaps 

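Now we will formally define some cluster swapping maps. Write

$\bar\Omega = \Omega \times \Omega \times \mathbb{E}$,

where $\mathbb{E}$ is the set of functions from $\mathbb{E}^d$ to $[0, \infty)$ and $\mathbb{E}^d$
is the set of edges of the lattice $\mathbb{Z}^d$. Let $\bar{\mathcal{F}}$ be the product
$\sigma$-algebra on $\bar\Omega$ and let $\bar{\mathcal{F}}^\tau$ be the $\sigma$-algebra generated
by $\mathcal{F}^\tau \times \mathcal{F}^\tau$ times the product $\sigma$-algebra on $\mathbb{E}$.
Let $\pi$ be the measure on $\mathbb{E}$ describing an independent product of parameter one
exponentials (i.e., the Gibbs measure on $\mathbb{E}$ in which
$H_\Lambda(r) = \sum_{x \in \Lambda} r(x)$ for each $\Lambda \subset\subset \mathbb{E}^d$ and each
$r \in \mathbb{E}$).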

Given a triple $(\phi_1, \phi_2, r)$, we can define $(\xi, \zeta, t)$, where $t$ is the total energy
per edge, as defined in the previous section. First we observe that
$(\phi_1, \phi_2, r) \to (\xi, \zeta, t)$ is a measure-preserving coordinate change:

Lemma 8.1.4 The map $(\phi_1, \phi_2, r) \to (\xi, \zeta, t)$ on any finite graph $\Lambda$ is an
injective, measure preserving coordinate change. That is, the measure of a measurable set in the
natural underlying measure on admissible $(\phi_1, \phi_2, r)$ configurations (i.e.,
$\lambda^{|\Lambda|} \times \lambda^{|\Lambda|}$ times Lebesgue measure on the product of
$[0, \infty)$ over the edges of $\Lambda$) is equal to the measure of its image in the natural
underlying measure on $(\xi, \zeta, t)$ configurations (i.e., $(\lambda^2)^{|\Lambda|}$ times
counting measure times Lebesgue measure on the product of $\mathbb{R}$ over the edges of
$\Lambda$).

Proof The map $(\phi_1, \phi_2, r) \to (\xi, \zeta, r)$ is injective and measure preserving by
Lemma 8.1.2. The map $(\xi, \zeta, r) \to (\xi, \zeta, t)$ is injective and measure preserving
because $t - r$ is a continuous function of $(\xi, \zeta)$.

Combining Lemmas 8.1.1, 8.1.3 and 8.1.4 gives the following:

Lemma 8.1.5 Let $(\phi_1, \phi_2, r) \in \bar\Omega$ be a random triplet with law
$\gamma_\Lambda(\cdot, \phi_1^0) \otimes \gamma_\Lambda(\cdot, \phi_2^0) \otimes \pi$ (so
$\phi_1^0, \phi_2^0 \in \Omega$ determine the boundary conditions outside of $\Lambda$), and define
$(\xi, \zeta, t)$ from $(\phi_1, \phi_2, r)$ as above. Then conditioned on $\xi$ and the total
energy $t$ (which determine the $K_e$ and $\mathbb{S}$), the conditional law of $\zeta$ is as
follows: throughout each component of the complement of $\mathbb{S}$ containing a vertex outside
of $\Lambda$, $\zeta$ is equal to its value at that vertex. On each component of the complement of
$\mathbb{S}$ that is strictly contained in $\Lambda$, $\zeta$ is a.s. constant, and the law of the
values on these components is given by an independent fair coin toss on each component.

We now define the maps that we will call cluster swaps. Let
$R_x^\Lambda : \bar\Omega \to \bar\Omega$ be the map such that
$R_x^\Lambda(\phi_1, \phi_2, r) = (\phi_1', \phi_2', r')$ where, if $(\xi, \zeta, t)$ and
$(\xi', \zeta', t')$ are defined from the triplets as above, we have $\xi' = \xi$, $t' = t$, and
$\zeta' = \zeta$ unless the vertices in the open cluster containing $x$ (as defined by
$(\phi_1, \phi_2, r)$) are all contained within $\Lambda$, in which case $\zeta' = -\zeta$ on that
cluster and $\zeta' = \zeta$ everywhere else. We write $R_x = R_x^{\mathbb{Z}^d}$.

Informally, $R_x^\Lambda$ swaps values of $\phi_1$ and $\phi_2$ on the open cluster containing $x$
(provided that cluster is contained in $\Lambda$) and then adjusts the residual energy in such a
way that the total energy on each edge is unchanged. Clearly, $R_x^\Lambda$ is an involution. When
we use $(\xi, \zeta, t)$ coordinates, it is obvious that $R_x^\Lambda$ preserves both the
underlying measure and the Hamiltonian $|t|$. We conclude the following:

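Lemma 8.1.6 If $\phi_1^0$ and $\phi_2^0$ are admissible and
$\bar\mu = \gamma_\Lambda(\cdot|\phi_1^0) \otimes \gamma_\Lambda(\cdot|\phi_2^0) \otimes \pi$, where
$\Lambda \subset\subset \mathbb{Z}^d$ and $x \in \Lambda$, then $\bar\mu R_x^\Lambda = \bar\mu$.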

We refer to $\mathbb{S}$ as the set of closed or swappable edges, meaning that there is enough
total energy on the edge to make it possible to swap the values of $\phi_1$ and $\phi_2$ at one
endpoint of the edge and not the other while (after adjusting $r$) preserving the total energy on
that edge. Edges in $\mathbb{E}^d \setminus \mathbb{S}$ are called open or unswappable. Observe in
particular that whenever $\phi_1$ and $\phi_2$ agree on one of the endpoints of an edge $e$, we
have $e \in \mathbb{S}$. Thus, each of the points on which $\zeta = 0$ is its own cluster of
$\mathbb{E}^d \setminus \mathbb{S}$.
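
On a finite graph, the cluster swap can be sketched as follows. This is an illustration under assumptions made here, not code from the source: the names `edge_energy` and `cluster_swap`, the dictionary-based representation, and the use of a single potential `V` for all edges are hypothetical simplifications. An edge is treated as swappable (closed) exactly when its total energy is at least the potential energy it would carry after exchanging the two surfaces at one endpoint only, which is the informal description given above.

```python
from collections import deque


def edge_energy(V, f1, f2, u, v):
    """Potential energy on edge (u, v) carried by the pair of surfaces (f1, f2)."""
    return V(f1[v] - f1[u]) + V(f2[v] - f2[u])


def cluster_swap(phi1, phi2, r, edges, V, x, interior):
    """Swap phi1 and phi2 on the open cluster of x if it lies inside `interior`.

    phi1, phi2: dict vertex -> height; r: dict edge -> residual energy >= 0;
    V: function of the height difference; interior: collection of vertices.
    Returns new (phi1, phi2, r) with the total energy of every edge unchanged.
    """
    # Total energy per edge: this is the quantity the swap must preserve.
    t = {e: edge_energy(V, phi1, phi2, *e) + r[e] for e in edges}

    def swappable(e):
        u, v = e
        # Energy if the two surfaces exchanged values at v but not at u.
        alt = V(phi2[v] - phi1[u]) + V(phi1[v] - phi2[u])
        return t[e] >= alt

    # Open edges are the non-swappable ones; build their adjacency lists.
    neighbors = {}
    for e in edges:
        if not swappable(e):
            u, v = e
            neighbors.setdefault(u, []).append(v)
            neighbors.setdefault(v, []).append(u)

    # Breadth-first search for the open cluster containing x.
    cluster, queue = {x}, deque([x])
    while queue:
        u = queue.popleft()
        for v in neighbors.get(u, []):
            if v not in cluster:
                cluster.add(v)
                queue.append(v)

    if not cluster <= set(interior):
        return dict(phi1), dict(phi2), dict(r)

    new1 = {v: (phi2[v] if v in cluster else phi1[v]) for v in phi1}
    new2 = {v: (phi1[v] if v in cluster else phi2[v]) for v in phi2}
    # Adjust residual energies so that t(e) is the same before and after.
    new_r = {e: t[e] - edge_energy(V, new1, new2, *e) for e in edges}
    return new1, new2, new_r
```

Edges leaving the cluster are closed by construction, so the adjusted residual energies stay non-negative, and applying the function twice with the same `x` returns the original triple.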

8.1.4 Perfect matching example and uniqueness proof overview 

One simple setting for cluster swapping is the domino tiling or perfect-matching-on-$\mathbb{Z}^2$
model described in Section ll.H.ll. We interpret cluster swaps in this setting and, as a preview of
later sections, sketch the arguments that show the uniqueness of the gradient Gibbs measure
of slope $u \in U_\Phi$.

Recall that there was a one-to-one correspondence between perfect matchings of Z 2 and 
finite energy height functions on the faces of Z 2 (defined up to additive constant), with 
respect to the appropriate potential. (Although the rest of our exposition assumes the 
height functions are defined on vertices, it will be simpler to visualize the correspondence 
in this section if we adopt the dual perspective and consider the functions to be defined on 
faces.) 

If $\phi_1$ and $\phi_2$ are two such height functions, corresponding to perfect matchings $T_1$
and $T_2$, then $\phi' := \phi_2 - \phi_1$ is a function on the square faces in the $\mathbb{Z}^2$
lattice with the following properties (see Figure 8.1):

1. If $x$ and $y$ are adjacent squares and the edge between them lies in both or neither of
$T_1$ and $T_2$, then $\phi'(x) = \phi'(y)$.

2. If $x$ and $y$ are adjacent squares and the edge between them lies in exactly one of $T_1$
and $T_2$, then $|\phi'(x) - \phi'(y)| = 1$.

Let $T$ be the set of edges that belong to exactly one of $T_1$ and $T_2$. Since every vertex is
incident to exactly zero or two edges in $T$, $T$ is a disjoint union of finite-length cycles and
infinite paths, which partition the squares of $\mathbb{Z}^2$ into regions on which $\phi'$ is
constant. The value of $\phi'$ changes by $\pm 1$ as one crosses one of these cycles. Recall also
that $\mathcal{L}$ is the set of elements of $\mathbb{Z}^2$ such that translation by these elements
preserves the standard "chessboard" coloring of the squares of $\mathbb{Z}^2$. The following simple
proposition is illustrated in Figure 8.1.

Proposition 8.1.7 In the domino tiling setting, $\mathbb{S}$ consists of the set of all edges such
that $\phi' = 0$ on at least one endpoint of that edge. The open clusters are the connected
components of $\{x : \phi'(x) \ne 0\}$ and the boundary of each open non-boundary-intersecting
cluster is a cycle of edges in $T$. The cluster swap $R_x$ reverses the sign of $\phi'$ on the
component of $\{x : \phi'(x) \ne 0\}$ containing $x$ and leaves $\phi'$ unchanged elsewhere.

A cluster swap in this context (as described in the previous section) amounts to swapping
the edge sets of $T_1$ and $T_2$ that lie in the interior of one of the cycles. When one swaps
the edge sets of $T_1$ and $T_2$ within a region, this does not alter $T$, but it reverses whether
the value of $\phi'$ changes by $1$ or $-1$ as one crosses each cycle in that region. We refer to
the swapping of the edges within a single cycle of $T$ (i.e., swapping which of the two
alternating sets of edges in the cycle belongs to $T_1$ and which belongs to $T_2$) as reversing
the orientation of the cycle.
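
The combinatorics of the symmetric difference can be illustrated on a small finite example. The sketch below is an assumption-laden toy, not part of the source: it works with matchings of a finite grid graph given as sets of vertex pairs (rather than with height functions on faces), and the helper names `normalize`, `symmetric_difference`, `degrees`, and `reverse_cycle` are hypothetical.

```python
def normalize(edge):
    """Store an undirected edge with its endpoints in sorted order."""
    return tuple(sorted(edge))


def symmetric_difference(t1, t2):
    """Edges that belong to exactly one of the two matchings."""
    return {normalize(e) for e in t1} ^ {normalize(e) for e in t2}


def degrees(edge_set):
    """Number of edges of edge_set incident to each vertex that it touches."""
    deg = {}
    for u, v in edge_set:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def reverse_cycle(t1, t2, cycle_edges):
    """Exchange which matching owns each edge of the given cycle."""
    a = {normalize(e) for e in t1}
    b = {normalize(e) for e in t2}
    c = {normalize(e) for e in cycle_edges}
    return (a - c) | (b & c), (b - c) | (a & c)
```

For two perfect matchings, every vertex touched by the symmetric difference has degree exactly two, and reversing a cycle leaves the symmetric difference itself unchanged.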

Since in this model, each $V_{x,y}(\eta)$ is equal to $0$ or $\infty$ for all $\eta$, the values of
$r$ are in fact irrelevant. That is, whenever $R_x(\phi_1, \phi_2, r) = (\psi_1, \psi_2, s)$, we
have $r = s$, and the values of $\psi_1$ and $\psi_2$ do not depend on the value of $r$.

Now, Lemma 8.1.6 implies that conditioned on a finite cycle of $T$ (separating a height
zero region outside from a height $\pm 1$ region inside) and on all the edges of $T_1$ and $T_2$
outside of that cycle, each of the two orientations of the cycle is equally probable. Applying
the same argument using $R_x(\phi_1, \phi_2 + c, r)$, where $c \in \mathbb{Z}$, we see that the same
is true for all cycles, and from this it is not hard to prove the following:

Lemma 8.1.8 Conditioned on the set $T$, the orientations of the finite cycles of $T$ have the
law of independent fair coin tosses.

Of course, this is essentially obvious even without cluster swapping, but the cluster 
swapping argument will be useful in more general settings. Using this lemma, the reader 
may be able to mentally prove the following (a more general version of which we prove later): 

Lemma 8.1.9 If $\mu_1$ and $\mu_2$ are distinct $\mathcal{L}$-invariant Gibbs measures on perfect
matchings and $\mu_1 \otimes \mu_2$-almost surely the symmetric difference $T$ of a pair
$(T_1, T_2)$ contains no infinite paths, then $\mu_1 = \mu_2$.

Figure 8.1: Above: the edges of $T_1$ and $T_2$ intersecting the grids shown. Bottom left:
the edges of $T$, the symmetric difference of $T_1$ and $T_2$, together with the height difference
$\phi' = \phi_1 - \phi_2$. The closed-edge set $\mathbb{S}$ consists of edges (dual to those shown)
whose endpoints are squares at least one of which has height zero. The open clusters are the
islands on which $\phi'$ is positive or negative. If $\Lambda$ is the $8 \times 8$ collection of
square faces shown, then there are four open clusters strictly contained in $\Lambda$. Bottom
right: the height difference $\psi' = \psi_1 - \psi_2$ and the corresponding tilings, where
$(\psi_1, \psi_2, s) = R_x(\phi_1, \phi_2, r)$ and $x$ is the square with $\phi'(x) = 2$. (The
values of $r$ and $s$ are irrelevant in this model.)

The next few sections will use the variational principle and a variety of cluster swapping
arguments to prove a version of the following which holds for SAPs in any dimension:

Lemma 8.1.10 If $\mu_1$ and $\mu_2$ are ergodic gradient Gibbs measures of the same slope
$u \in U_\Phi$, then $\mu_1 \otimes \mu_2$ almost surely, the symmetric difference $T$ of $T_1$ and
$T_2$ contains at most one infinite path.

Chapter El will then (in a much more general but strictly two-dimensional context) rule
out the case of one infinite path:

Lemma 8.1.11 In the setting of the previous lemma, $T$ almost surely does not contain a
single infinite path.

The lemmas above will together imply the uniqueness of the ergodic Gibbs measure of
slope $u \in U_\Phi$. We now return to the more general setting in which $\Phi$ is any SAP.

8.2 Monotonicity and log concavity 
8.2.1 Stochastic domination via cluster swapping 

The following "monotonicity" property is very well known for many systems with convex 
difference potentials; it is used, for example, in f|0J and JH] for Ginzburg-Landau and domino 
tiling models, respectively. Cluster swapping is one convenient way of proving this fact. 

Recall first that if $\mu$ and $\nu$ are probability measures on an arbitrary measure space
$(X, \mathcal{X})$ and $\le$ is a partial ordering on $X$, then we say that $\mu \prec \nu$ or $\nu$
stochastically dominates $\mu$ if there exists a measure $\rho$ on $X \times X$ (with the product
$\sigma$-algebra) such that on the set of pairs $(a, b) \in X \times X$, it is $\rho$ a.s. the case
that $a \le b$, and the first and second marginals of $\rho$ are respectively $\mu$ and $\nu$. When
$\phi$ and $\psi$ are real or integer valued functions with common domains, we use the partial
ordering $\phi \le \psi$ to mean that $\phi(x) \le \psi(x)$ for all $x$ in the domain.

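Lemma 8.2.1 Suppose that $\phi_1^0, \phi_2^0 \in \Omega$ are admissible and
$\phi_1^0 \le \phi_2^0$. Then for any $\Lambda \subset\subset \mathbb{Z}^d$, we have
$\gamma_\Lambda(\cdot|\phi_1^0) \prec \gamma_\Lambda(\cdot|\phi_2^0)$.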

Proof The measure
$\gamma_\Lambda(\cdot|\phi_1^0) \otimes \gamma_\Lambda(\cdot|\phi_2^0) \otimes \pi$ on triplets
$(\phi_1, \phi_2, r)$ induces a corresponding measure on triplets $(\xi, \zeta, t)$. Clearly,
$\zeta \le 0$ at all vertices outside of $\Lambda$. Since $\zeta$ is a.s. constant on each component
of the complement of $\mathbb{S}$ and either $0$ or $-1$ at all vertices outside of $\Lambda$, this
implies that any open cluster that is not strictly contained in $\Lambda$ has $\zeta \le 0$.

By Lemma 8.1.5, the sign of $\zeta$ on each open cluster on which $\zeta \ne 0$ may be determined
by an independent coin toss; in other words, on each open cluster the coin toss decides
whether $\phi_1 = \xi_1$ and $\phi_2 = \xi_2$ or $\phi_1 = \xi_2$ and $\phi_2 = \xi_1$.

Suppose that instead, for each cluster we toss an independent fair coin and, depending
on the outcome, either take $\phi_1 = \phi_2 = \xi_1$ or $\phi_1 = \phi_2 = \xi_2$. Clearly, this
change does not affect the marginal distributions of $\phi_1$ and $\phi_2$. However, it does
guarantee that we will have $\phi_1 = \phi_2$ at all vertices that are not on open clusters
containing vertices outside of $\Lambda$. Since $\phi_1 \le \phi_2$ on such clusters,
$\phi_1 \le \phi_2$ throughout $\Lambda$. ∎






The following is immediate: 

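Corollary 8.2.2 If $\phi_1^0(x) \le \phi_2^0(x) \le \phi_1^0(x) + c$ for all vertices
$x \in \mathbb{Z}^d \setminus \Lambda$ which are adjacent to a vertex in $\Lambda$, then

$\gamma_\Lambda(\cdot|\phi_1^0) \prec \gamma_\Lambda(\cdot|\phi_2^0) \prec \gamma_\Lambda(\cdot|\phi_1^0 + c)$.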
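
A consequence of this sandwich for expected heights can be checked numerically in the smallest possible case. The sketch below is an illustration under assumptions made here: it takes $E = \mathbb{Z}$, a single interior vertex $y$ lying between two boundary vertices with heights $c$ and $0$, the same potential `V` on both edges, and it truncates the heights to $[-N, N]$ (a truncation whose effect is negligible for the values used). The function name `mean_height` is hypothetical.

```python
import math


def mean_height(c, V, N=40):
    """Expected height at one interior vertex between boundary heights c and 0.

    The law of the interior height a is proportional to exp(-V(a - c) - V(a)),
    restricted to the integers in [-N, N].
    """
    heights = range(-N, N + 1)
    weights = [math.exp(-V(a - c) - V(a)) for a in heights]
    z = sum(weights)
    return sum(a * w for a, w in zip(heights, weights)) / z
```

Raising the boundary height from $c$ to $c + 1$ should raise the mean by an amount between $0$ and $1$, as the corollary predicts for stochastically ordered laws.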

We will also use the following as a technical lemma. 

Corollary 8.2.3 Let $\phi^c$ be the function which is equal to an admissible function $\phi$
everywhere except at one vertex $x \in \mathbb{Z}^d \setminus \Lambda$ where it is equal to $c$;
when $c$ is in the interval for which $\phi^c$ is admissible, let $F(c)$ be the
$\gamma_\Lambda(\cdot|\phi^c)$-expected value of $\phi(y)$, where $y \in \Lambda$. Then $F(c)$ is
monotone increasing and $F(c_2) - F(c_1) \le c_2 - c_1$ for all $c_1, c_2 \in E$ with
$c_1 \le c_2$. In particular, if $c$ is chosen from a distribution $\nu$ on $E$ (supported on $c$
for which $\phi^c$ is admissible), then the variance of $F(c)$ is less than or equal to the
variance of $c$.

Proof The first two claims follow immediately from Lemma 8.2.1. All that remains to prove
is that if $c$ is chosen from $\nu$ and $F$ is monotone Lipschitz (i.e.,
$0 \le F(c_2) - F(c_1) \le c_2 - c_1$ whenever $c_1 \le c_2$) then the $\nu$-variance of the random
variable $F(c)$ is less than or equal to that of $c$. Equivalently, if $a_1$ and $a_2$ are sampled
independently from $\nu$, we would like to show that the variance of $F(a_1) - F(a_2)$ is less
than or equal to that of $a_2 - a_1$. Since both variables have mean zero, and
$(F(a_1) - F(a_2))^2 \le (a_1 - a_2)^2$ for all $a_1, a_2$, the result follows. ∎
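
The equivalence used in the last step rests on the standard identity $\operatorname{Var}(X) = \tfrac12 \mathbb{E}[(X_1 - X_2)^2]$ for independent copies $X_1, X_2$ of $X$. Written out (with $a_1, a_2$ independent samples from $\nu$, as in the proof), the comparison reads:

```latex
\operatorname{Var}_\nu\bigl(F(c)\bigr)
  = \tfrac12\,\mathbb{E}\bigl[(F(a_1) - F(a_2))^2\bigr]
  \le \tfrac12\,\mathbb{E}\bigl[(a_1 - a_2)^2\bigr]
  = \operatorname{Var}_\nu(c).
```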

8.2.2 Log concavity via cluster swapping 

A probability distribution on $E$ is log concave if the log of its Radon-Nikodym derivative
$f$ with respect to $\lambda$ is a concave function. (In particular, $f$ is continuous on the
interval on which it is finite.) On $\mathbb{Z}$, this is equivalent to the statement that $f$ is
continuous and $2 \log f(a) \ge \log f(a + 1) + \log f(a - 1)$ for all $a \in \mathbb{Z}$ (where we
write $\log 0 = -\infty$), or equivalently $f(a)^2 \ge f(a + 1) f(a - 1)$ for all
$a \in \mathbb{Z}$. Log concavity also implies that $f(a)^2 \ge f(a + c) f(a - c)$ for all
$c \in \mathbb{Z}$.

If $E = \mathbb{R}$ and $f$ is assumed to be continuous, then the log concavity of $f$ is
equivalent to the statement that $f(a)^2 \ge f(a - c) f(a + c)$ for all $c \in \mathbb{R}$. (If
$\log f$ is continuous and fails to be concave, then it is easy to see that there is some
arithmetic sequence along which it fails to be concave, and the discrete characterization above
applies to that sequence.)

Lemma 8.2.4 Suppose $\Lambda \subset\subset \mathbb{Z}^d$, $x_0 \in \Lambda$, and
$\phi_0 \in \Omega$ is admissible. If $\phi$ is a random function chosen from
$\gamma_\Lambda(\cdot|\phi_0)$, then the law of $\phi(x_0)$ is log concave. In fact, the same result
holds if $\Lambda$ is a finite subset of the vertices of any connected graph and $\Phi$ a convex
nearest neighbor potential on that graph, with admissible boundary conditions fixed outside of
$\Lambda$.

Proof When $E = \mathbb{R}$, this follows from a variant of the Brunn-Minkowski inequality known
as the Prékopa-Leindler inequality; see Theorem 4.2 of [12] for details. We present a simple
argument that uses cluster swaps in the case $E = \mathbb{Z}$.

Let $\phi_1^0 = \phi_0$ and $\phi_2^0 = \phi_0 + c$ for some $c \in \mathbb{Z}$ with $c > 0$, and
sample $(\phi_1, \phi_2, r)$ from
$\gamma_\Lambda(\cdot|\phi_1^0) \otimes \gamma_\Lambda(\cdot|\phi_2^0) \otimes \pi$. Then conditioned
on $\xi(x_0) = (a, a + c)$ (for some $a \in \mathbb{Z}$ such that this occurs with positive
probability) and $t$, the probability that $\phi_2(x_0) > \phi_1(x_0)$ is $1/2$ if the open cluster
containing $x_0$ is contained in $\Lambda$ and $1$ otherwise, by Lemma 8.1.5. Thus, conditioned
only on $\xi(x_0) = (a, a + c)$ the probability that $\phi_2(x_0) > \phi_1(x_0)$ is between $1/2$
and $1$, which implies that $f(a)^2 \ge f(a + c) f(a - c)$.

A similar argument (using regular conditional probabilities) yields an alternate proof in
the case $E = \mathbb{R}$. The extension to general graphs is trivial. ∎
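
The statement of Lemma 8.2.4 can be checked by brute force on a tiny example. The following sketch makes several assumptions that are not in the source: it takes $E = \mathbb{Z}$ with heights truncated to $[-N, N]$, uses the path graph left, $x_0$, $x_1$, right with the two end vertices as boundary, and applies one potential `V` to every edge. The names `marginal_law` and `is_log_concave` are hypothetical.

```python
import math


def marginal_law(phi_left, phi_right, V, N=12):
    """Unnormalized law of phi(x0) on the path left - x0 - x1 - right.

    Heights at the two interior vertices range over the integers in [-N, N];
    the height at x1 is summed out.
    """
    heights = range(-N, N + 1)
    law = {}
    for a in heights:
        law[a] = sum(
            math.exp(-V(a - phi_left) - V(b - a) - V(phi_right - b))
            for b in heights
        )
    return law


def is_log_concave(law, tol=1e-12):
    """Check f(a)^2 >= f(a-1) f(a+1) at interior points, up to rounding."""
    keys = sorted(law)
    return all(
        law[a] ** 2 >= law[a - 1] * law[a + 1] * (1 - tol)
        for a in keys[1:-1]
    )
```

With a convex `V` such as `abs` or the square, the check succeeds; with the non-convex choice `min(abs(eta), 1)` it fails in this particular example, which suggests that convexity is doing real work in the lemma.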

8.2.3 Log concavity for extremal (non-gradient) Gibbs measures

The log concavity arguments of Section 8.2.2 combined with Lemma 3.2.3 imply the following:

Lemma 8.2.5 If $\Phi$ is a simply attractive potential, and
$\mu \in \mathrm{ex}\,\mathcal{G}(\Omega, \mathcal{F})$, and $x \in \mathbb{Z}^d$, then the density
of the height distribution of $\phi(x)$, for $\phi$ chosen from $\mu$, is log concave. In
particular, the random variable $\phi(x)$ has finite mean, variance, and moments of all orders.

Proof By Lemma 3.2.3, for $\mu$ almost all $\phi$, the measures $\gamma_{\Lambda_n}(\cdot|\phi)$
converge to $\mu$ in the topology of local convergence, where the $\Lambda_n$ are cubes of side
length $(2n + 1)$, centered at the origin. Let $\nu_{\phi,n,x}$ be the law of $\psi(x)$ when $\psi$
is chosen from $\gamma_{\Lambda_n}(\cdot|\phi)$; let $\nu_x$ be the law of $\psi(x)$ when $\psi$ is
chosen from $\mu$. Then the preceding statement implies that for $\mu$ almost all $\phi$, the
measures $\nu_{\phi,n,x}$ converge to $\nu_x$ in the $\tau$-topology (i.e., the smallest topology in
which $\nu \mapsto \nu(A)$ is continuous for each Borel set $A \subset E$). The reader may check
that the $\tau$-topology limit of a sequence of log concave distributions on $\mathbb{R}$ or
$\mathbb{Z}$ is necessarily log concave, and the result then follows. The fact that a
distribution $\nu$ is log concave implies that the log probabilities must decrease at least
linearly; thus, the tails of $\nu$ decrease exponentially, and moments of all orders exist. ∎
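
For the case $E = \mathbb{Z}$, the exponential tail bound in the last step can be spelled out as follows. This is a sketch: $f$ denotes the log concave density of $\nu$, and $a_0$ is any point of the support at which the ratio $\rho(a_0)$ below is less than one. Such a point exists when the support is unbounded above, because $\sum_a f(a) < \infty$ forces $f$ to decrease somewhere.

```latex
\rho(a) := \frac{f(a+1)}{f(a)} \quad \text{is non-increasing in } a \text{ on } \{f > 0\},
\qquad
\rho(a_0) < 1 \;\Longrightarrow\; f(a_0 + k) \le f(a_0)\,\rho(a_0)^{k} \quad (k \ge 0).
```

The same argument applied to $a \mapsto f(-a)$ controls the left tail, and geometric tails on both sides give finite moments of all orders.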

A similar argument yields a characterization of smooth gradient Gibbs measures:

Lemma 8.2.6 If $\Phi$ is a simply attractive potential, and
$\mu \in \mathrm{ex}\,\mathcal{G}(\Omega, \mathcal{F}^\tau)$, and $x \in \mathbb{Z}^d$, then $\mu$
is a smooth gradient Gibbs measure (i.e., a restriction to $\mathcal{F}^\tau$ of a Gibbs measure on
$(\Omega, \mathcal{F})$) if and only if, for $\mu$ almost every $\phi$, the measures
$\nu_{\phi,n,x}$ converge to a non-zero limit in the $\tau$-topology.

Proof The proof of Lemma 8.2.5 implies that $\nu_{\phi,n,x}$ almost surely has a limit whenever
$\mu$ is smooth. (This also follows from Lemma 3.2.3.) For the converse, let $M_n$ be the median of
the measure $\nu_{\phi,n,x}$, and note that the limit $M = \lim_{n \to \infty} M_n$ exists for $\mu$
almost every $\phi$ and is tail measurable. So, we define $\bar\mu$ as follows: to sample from
$\bar\mu$, first sample $\phi$ from $\mu$ (determined only up to additive constant) and then choose
the constant in such a way that $M = 0$ (when $E = \mathbb{R}$) or $M \in [0, 1)$ (when
$E = \mathbb{Z}$). ∎

8.2.4 Existence of minimal gradient Gibbs measures of a given slope

Log concavity also yields a simple proof of the existence of an ergodic $\mathcal{L}$-invariant
gradient Gibbs measure of a given slope $u \in U_\Phi$ with $\sigma(u) < \infty$. Lemma 4.4.1
implies the existence of an $\mathcal{L}$-ergodic gradient Gibbs measure of slope $u$ provided that
$u$ does not lie in an unbounded subset of $\mathbb{R}^d$ on which $\sigma$ is linear. We will now
strengthen that result to all $u \in U_\Phi$. (Note: this is a preliminary result that we will use
to prove that $\sigma$ is strictly convex. If we could prove strict convexity of $\sigma$ without
this result, then this result would follow from Lemma 4.4.1.) As always in this chapter, we
assume that $m = 1$ and $\Phi$ is simply attractive.

Lemma 8.2.7 There exists an ergodic gradient phase
$\mu_u \in \mathcal{G}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$ of slope $u$ for every
$u \in U_\Phi$.

Proof First, fix $k$ so that $k\mathbb{Z}^d \subset \mathcal{L}$. By Lemma 4.2.6, some subsequence
of the measures $\mu_n$ (defined in Section 4.2) on functions on $(nk)^d$ tori converges in the
topology of local convergence to an $\mathcal{L}$-invariant gradient Gibbs measure
$\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$.

Now, using the notation of Section 4.2, Lemma 8.2.4 implies that the probability density
of $\phi_g(y) - \phi_g(x)$ induced by $\mu_n$ is log concave for any $x, y \in \mathbb{Z}^d$. Also, if $x - y \in \mathcal{L}$, then this
density has expectation equal to the inner product {^[u\,y — x) when E = Z and (-, y — x) 
if£ = R. 

For any $C_1 > 0$ and $C_2 \le C_3$, let $S_{C_1,C_2,C_3}$ be the set of log concave probability
densities on $E$ which are laws for random variables $Y$ for which the expectation of $|Y|$ is less
than or equal to $C_1$ and the expectation of $Y$ is contained in $[C_2, C_3]$. The reader may
easily verify (by deriving a uniform exponential bound on the decay of the tail probability
densities) that the sets $S_{C_1,C_2,C_3}$ are compact in the $\tau$-topology. In particular, this
implies that for all $x$ and $y$ in $\mathcal{L}$, the $\mu$ probability density of the random
variable $\phi(y) - \phi(x)$ is log concave and $\mu(\phi(y) - \phi(x)) = (u, y - x)$. (One can see
this by choosing $C_2 = (u, y - x) - \epsilon$ and $C_3 = (u, y - x) + \epsilon$ for arbitrarily
small $\epsilon$.)

By Lemma 3.2.5, we can write $\mu = \int_{\mathrm{ex}\,\mathcal{G}^\tau} \nu\, w_\mu(d\nu)$ for some
probability measure $w_\mu$ on the space of ergodic gradient Gibbs measures. By Theorem 3.3.2,
$w_\mu$ is supported on the space of ergodic gradient Gibbs measures with finite specific free
energy. By Lemma 2.3.8, $w_\mu$ is also supported on the space of gradient Gibbs measures with
finite slope, and $S(\mu) = \int_{\mathrm{ex}\,\mathcal{G}^\tau} S(\nu)\, w_\mu(d\nu)$. We claim that
the random variable $S(\nu)$, where $\nu$ is sampled from $w_\mu$, is equal to $u$ with probability
one, and hence $w_\mu$-almost every $\nu$ is an ergodic gradient Gibbs measure of slope $u$ and
finite specific free energy (as is needed for the lemma).

To this end, we first observe that there is a uniform bound (independent of $n$) on the
expected difference $\phi(y) - \phi(x)$ for any neighboring $x$ and $y$. Choose a subsequence of the
$n$ values along which $-|T_n|^{-1} \log Z_{T_n}$ converges to the value
$\liminf_{n \to \infty} -|T_n|^{-1} \log Z_{T_n}$. The uniform bound on specific free energy implies
a uniform bound $C$ on the $\mu_n$ expected values of $|\phi(y) - \phi(x)|$ (as in Lemma 2.3.8)
which in turn implies a uniform bound on the variance of $\phi(y) - \phi(x)$ for any $n$ and any
pair of neighboring points $x$ and $y$ (as in Lemma MM-).

Now, we use a martingale/monotonicity argument identical to the one in jTHj to show
that if $x$ and $y$ are $j$ units apart in $T_n$, then the variance of $\phi(y) - \phi(x)$ is
bounded above by $jC$. First construct a path $x = a_0, a_1, a_2, \ldots, a_j = y$. Add a constant
to $\phi$ so that $\phi(a_0) = 0$ and write $\delta_i = \phi(a_i) - \phi(a_{i-1})$. Let
$\mathcal{F}_n$ be the smallest $\sigma$-algebra in which $\delta_i$ is measurable for $i \le n$.

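Write $b_n = \mathbb{E}_{\mathcal{F}_n}(\phi(y) - \phi(x)) - \mathbb{E}(\phi(y) - \phi(x))$, where
$\mathbb{E}_{\mathcal{A}}$ represents conditional expectation with respect to a $\sigma$-algebra
$\mathcal{A}$. Clearly, the sequence $b_n$ is a martingale. Writing
$b_n = b_{n-1} + (b_n - b_{n-1})$, we have

$\mathbb{E} b_n^2 = \mathbb{E}\left(b_{n-1}^2 + 2 b_{n-1}(b_n - b_{n-1}) + (b_n - b_{n-1})^2\right) = \mathbb{E} b_{n-1}^2 + \mathbb{E}(b_n - b_{n-1})^2.$

Inducting on $n$ gives the following standard fact about martingales:
$\mathbb{E} b_j^2 = \sum_{i=1}^{j} \mathbb{E}(b_i - b_{i-1})^2$. We will now derive bounds on the
individual terms $\mathbb{E}(b_i - b_{i-1})^2$.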

If $\delta_1, \ldots, \delta_{i-1}$ are fixed, then we may view $b_i$ as a function of $\delta_i$.
By Corollary 8.2.3 (and its obvious analog on the torus), this function is Lipschitz with
Lipschitz constant one, and the expected variance of $b_i$ (conditioned on
$\delta_1, \ldots, \delta_{i-1}$) is less than or equal to that of $\delta_i$, i.e.,

$\mathbb{E} b_i^2 - \mathbb{E}\left(\mathbb{E}_{\mathcal{F}_{i-1}} b_i\right)^2 \le \mathbb{E} \delta_i^2 - \mathbb{E}\left(\mathbb{E}_{\mathcal{F}_{i-1}} \delta_i\right)^2.$

The left hand side is equal to $\mathbb{E} b_i^2 - \mathbb{E} b_{i-1}^2 = \mathbb{E}(b_i - b_{i-1})^2$,
and the right hand side is less than or equal to $\mathbb{E} \delta_i^2$. Since these
$\mathbb{E} \delta_i^2$ are finite, periodic functions of the edges in $\mathbb{Z}^d$ they
correspond to, summing over $i$ yields that the variance of $\phi(y) - \phi(x)$ is indeed bounded
above by $jC$, where
$C = \sup\{\mathbb{E}(\phi(x_1) - \phi(x_2))^2 : |x_1 - x_2| = 1;\ x_1, x_2 \in \mathbb{Z}^d\}$.

Together with log concavity and the compactness of the set of log concave probability
densities with given bounds on their variances and the expectations, this implies that (for
each basis vector $e_i$ of $\mathbb{Z}^d$), the $\mu$ probability that $\phi(j e_i) - \phi(e_i)$
differs from its expected value by more than $\epsilon j$ decays exponentially in $j$. If, with
some $w_\mu$-positive probability $\delta$, the $i$th component of the slope of an ergodic component
of $\mu$ is greater than $u_i + 2\epsilon$, then the one-dimensional ergodic theorem implies that
$\liminf_{j \to \infty} \mu(\{\phi(j e_i) - \phi(e_i) > (u_i + \epsilon) j\}) \ge \delta$,
contradicting this exponential decay. ∎

Lemma 4.2.6 and Corollary 6.1.5 now imply the following:

Lemma 8.2.8 The ergodic measure $\mu_u$ of slope $u$ constructed above is a minimal gradient
phase, i.e., it satisfies $SFE(\mu_u) = \sigma(u)$.

8.3 Measures on triplets 
8.3.1 Definitions 

The proofs of the main theorems in this chapter all rely on "infinite cluster swapping" maps,
which we have yet to define. In this section, we define and make some simple observations
about measures defined on the space $\bar\Omega = \Omega \times \Omega \times \mathbb{E}$ of infinite
triplets. Let the $\sigma$-algebra $\bar{\mathcal{F}}$ be the product $\sigma$-algebra on
$\bar\Omega$ and let $\bar{\mathcal{F}}^\tau$ be the $\sigma$-algebra generated by
$\mathcal{F}^\tau \times \mathcal{F}^\tau$ times the product topology on $\mathbb{E}$.

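We think of $\Phi$ as extending to this space by writing:

$\Phi_\Lambda(\phi_1, \phi_2, r) = \Phi_\Lambda(\phi_1) + \Phi_\Lambda(\phi_2) + \sum_e r(e),$

where the latter sum is over all edges $e$ which contain at least one vertex of $\Lambda$. We
similarly extend $H_\Lambda$ and the probability kernels to triplets. We defined these kernels in
Section 1.1.2 to be

$\gamma_\Lambda(A, \phi) = Z_\Lambda(\phi)^{-1} \int \prod_{x \in \Lambda} d\phi(x)\, \exp[-H_\Lambda(\phi)]\, 1_A(\phi).$

We now write

$\bar\gamma_\Lambda(A, \phi_1, \phi_2, r) = Z_\Lambda(\phi_1)^{-1} Z_\Lambda(\phi_2)^{-1} \int \prod_{x \in \Lambda} d\phi_1(x) \prod_{x \in \Lambda} d\phi_2(x) \prod_e dr(e)\, \exp[-H_\Lambda(\phi_1, \phi_2, r)]\, 1_A(\phi_1, \phi_2, r),$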

where again, the products over $e$ are taken over edges with at least one vertex of $\Lambda$.
(Note that we do not need the term $Z_\Lambda(r)^{-1}$, since this value is identically one
regardless of the size of $\Lambda$ and the value of $r$ on edges not intersecting $\Lambda$.) A
Gibbs measure on $(\bar\Omega, \bar{\mathcal{F}})$ is a measure on $(\bar\Omega, \bar{\mathcal{F}})$
which is preserved by these kernels. A gradient Gibbs measure is defined accordingly, replacing
$\bar{\mathcal{F}}$ with $\bar{\mathcal{F}}^\tau$. Note that our definition implies that in any
Gibbs measure or gradient Gibbs measure on $(\bar\Omega, \bar{\mathcal{F}})$, the random variables
$r(e)$ are independent of $\phi_1$, $\phi_2$, and independently identically distributed according
to a parameter one exponential distribution on $[0, \infty)$. We will denote the latter measure on
$\mathbb{E}$ by $\pi$ throughout this chapter.

We say that a (gradient) Gibbs measure on triplets is $\mathcal{L}$-invariant if it is invariant under
the shifts $\theta_v$, $v \in \mathcal{L}$, which move the three components $\psi_1, \psi_2, r$ in tandem; it is $\mathcal{L}$-ergodic if
it is extremal in the set of $\mathcal{L}$-invariant measures, and extremal if it is extremal in the set of
Gibbs measures (gradient Gibbs measures) on $(\overline{\Omega}, \overline{\mathcal{F}})$ (resp., $(\overline{\Omega}, \overline{\mathcal{F}}^\tau)$).

We can also define Gibbs measures, free energy, and specific free energy as we did in
Chapter 2, replacing $H_\Lambda$ by $H_\Lambda : \overline{\Omega} \to \mathbb{R}$ defined by

$$H_\Lambda(\phi_1, \phi_2, r) = H_\Lambda(\phi_1) + H_\Lambda(\phi_2) + \sum_e r(e).$$

In particular, if $\mu \in \mathcal{P}_{\mathcal{L}}(\overline{\Omega}, \overline{\mathcal{F}}^\tau)$, then we write

$$SFE(\mu) = \lim_{n \to \infty} |\Lambda_n|^{-1}\, \mathcal{H}\left(\mu_{\Lambda_n},\ e^{-H^o_{\Lambda_n}}\, \lambda^{|\Lambda_n| - 1} \otimes \lambda^{|\Lambda_n| - 1} \otimes \lambda_{[0,\infty)}^{|E_n|}\right),$$

where the expressions $\lambda^{|\Lambda_n| - 1}$ are interpreted the same way as in Section 2.3, $H^o_\Lambda$ is
defined analogously to $H_\Lambda$ (i.e., it is the sum of the energy contributions from edges strictly
contained in $\Lambda$), and $E_n$ is the set of edges with both endpoints in $\Lambda_n$.

We can also define the slope $S(\mu) = (u, v)$ to be the two slopes of the marginal distributions
of $\mu$. We write $S_a(\mu) = \frac{u + v}{2}$ for the average slope of $\mu$.






8.3.2 Extremal decompositions of Gibbs measures on triplets 

The following simple fact will be frequently useful: 

Lemma 8.3.1 If a gradient measure $\mu$ on triplets is extremal, then its three marginals are
also extremal. Also, any independent product of the form $\mu_1 \otimes \mu_2 \otimes \rho$, where $\mu_1$, $\mu_2$, and $\rho$
are extremal, is itself extremal.

(We remark that not every extremal measure on triplets is an independent product of its three
components; for example, an extremal measure could have extremal marginals but have the
first two components coupled in such a way that they are almost surely equal.)

Proof If $\mu$ is extremal, then it is clear that its marginals must be extremal, since otherwise
there would be a tail event (an event involving only one of the three components) with
non-trivial $\mu$ probability.

To prove that the product of extremal measures is extremal, suppose otherwise, i.e., that
there exists a tail-measurable set $A \subset \overline{\Omega}$ with $0 < \mu_1 \otimes \mu_2 \otimes \rho(A) < 1$. Then the
conditional probability of $A$ given the first component $\phi_1$ is a tail-measurable function of $\Omega$
which is not $\mu_1$-almost surely constant, a contradiction. ∎

Note that the analogous result for ergodic gradient measures is false. To see that independent
products of ergodic measures can fail to be ergodic, consider the following example:
let $d = 1$ and $\phi_0(i) = i \pmod 2$, and let $\mu$ be the probability measure on $\Omega$ that puts half
its mass on $\phi_0$ and half on $\phi_1 = 1 - \phi_0$. Clearly, $\mu$ is ergodic, as is its restriction to $\mathcal{F}^\tau$. However,
$\mu \otimes \mu$ is not ergodic on $\Omega \times \Omega$, since $\{(\phi_0, \phi_0), (\phi_1, \phi_1)\}$ and $\{(\phi_0, \phi_1), (\phi_1, \phi_0)\}$ are both
shift-invariant events with probability $\frac{1}{2}$.
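
The counterexample is small enough to check mechanically. The following Python sketch is only an illustration under simplifying assumptions: each period-2 configuration is encoded by its values at sites 0 and 1, and the names `shift`, `is_shift_invariant`, and `mass` are hypothetical helpers rather than notation from the text.

```python
from fractions import Fraction
from itertools import product

# Each period-2 configuration on Z is encoded by its values at sites 0 and 1.
PHI0 = (0, 1)   # phi_0(i) = i mod 2
PHI1 = (1, 0)   # phi_1 = 1 - phi_0

def shift(phi):
    """Shift by one site: (theta phi)(i) = phi(i + 1)."""
    return (phi[1], phi[0])

# mu puts mass 1/2 on each configuration, so mu x mu is uniform on the four pairs.
support = [PHI0, PHI1]
pair_mass = {pair: Fraction(1, 4) for pair in product(support, repeat=2)}

def is_shift_invariant(event):
    """An event on pairs is invariant if shifting both coordinates in tandem preserves it."""
    return {(shift(a), shift(b)) for a, b in event} == set(event)

def mass(event):
    """Probability of an event under the product measure mu x mu."""
    return sum(pair_mass[pair] for pair in event)

diagonal = {(PHI0, PHI0), (PHI1, PHI1)}
anti_diagonal = {(PHI0, PHI1), (PHI1, PHI0)}
```

Both `diagonal` and `anti_diagonal` pass the invariance check and have mass 1/2, which is exactly the failure of ergodicity described above; a single pair such as `{(PHI0, PHI0)}` is not invariant.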

8.3.3 First half of variational principle for triplets 

We will need the obvious analog of Theorem 2.5.2; its proof is identical to that of Theorem
2.5.2.

Lemma 8.3.2 If $\mu \in \mathcal{P}_{\mathcal{L}}(\overline{\Omega}, \overline{\mathcal{F}}^\tau)$ has minimal specific free energy among $\mathcal{L}$-invariant measures
with slope $(u, v)$, then $\mu$ is a gradient Gibbs measure.

Lemma 8.3.3 The minimal specific free energy among measures of slope $(u, v)$ is equal to
$\sigma(u) + \sigma(v)$ and is obtained by an independent product $\mu_u \otimes \mu_v \otimes \nu$ (where $\mu_u$ and $\mu_v$ are as defined in Lemma
8.2.7 and $\nu$ is an independent product of parameter one exponentials).

Proof It is obvious that $SFE(\mu) \ge SFE(\mu_1) + SFE(\mu_2) + SFE(\nu)$, where $\mu_1$, $\mu_2$, and $\nu$
are the marginals (use Lemma 2.1.4 and take limits). Since the latter term is at most zero,
the statement now follows from Lemma 8.2.8. ∎
8.4 Height offset variables

Consider a measure $\mu \in \mathcal{P}_{\mathcal{L}}(\Omega, \mathcal{F}^\tau)$ with finite specific free energy. A function $h : \Omega \to
\mathbb{R} \cup \{\infty\}$ is called a height offset variable for $\mu$ if the following are true:

1. $h(\phi + c) = h(\phi) + c$ for all $\phi \in \Omega$ and $c \in E$.

2. $h$ is tail-measurable on $\Omega$.

3. $h$ is $\mu$-almost surely finite.

4. If $v \in \mathcal{L}$, then, $\mu$-almost surely, $h(\phi) = h(\theta_v \phi) + (u, v)$, where $u$ is the slope of the
ergodic component of fi from which was chosen, i.e, u = Sfa^). (Recall Lemma 

E33) 

Although a sample from a gradient Gibbs measure is defined only up to additive constant,
height offset variables, when they exist, provide a canonical way of choosing that additive
constant.

Our main motivating example is when $h$ is the limit of the average value of $\phi$ on increasingly
large cubes centered at the origin (and $h$ could be defined to be infinity if no such
limit exists); we will show later that if $\mu$ is a smooth minimal phase, then the $h$ thus
defined does satisfy the above criteria.

Lemma 8.4.1 If $\mu$ is a gradient Gibbs measure and $h$ is a height offset variable for $\mu$, then
$\mu$ is smooth, i.e., $\mu$ is the restriction to $\mathcal{F}^\tau$ of a Gibbs measure $\mu'$ on $(\Omega, \mathcal{F})$.

Proof We define $\mu'$ as follows: to sample from $\mu'$, first choose $\phi$ (defined up to additive
constant) from $\mu$, and then output $\phi - h(\phi)$ (if $E = \mathbb{R}$) or $\phi - \lfloor h(\phi) \rfloor$ (if $E = \mathbb{Z}$). Since any
function of $\phi - h(\phi)$ can be written as an $\mathcal{F}^\tau$-measurable function of $\Omega$, this $\mu'$ is well-defined.
Since $h$ is tail-measurable (and hence its value is almost surely unchanged by the transition
kernels $\gamma_\Lambda$), it is now straightforward to check that $\mu'$ is a Gibbs measure on $(\Omega, \mathcal{F})$. ∎

If $E = \mathbb{R}$, then $\mu'$-almost surely, $h(\phi) = 0$. If $E = \mathbb{Z}$, then $\mu'$-almost surely $h(\phi) \in [0, 1)$.
When $h$ is given, we refer to the measure $h(\mu)$, a measure on $[0, 1)$, as the height offset
spectrum of $\mu$. The $\mathcal{L}$-ergodicity of $\mu$ now implies the following:

Lemma 8.4.2 If $E = \mathbb{Z}$ and $h$ is a height offset variable for an $\mathcal{L}$-ergodic gradient measure
$\mu$ of slope $u$, then the height offset spectrum $h(\mu)$ (the law of $h(\phi)$ if $\phi$ is chosen from $\mu$) is a
measure on $[0, 1)$ which is ergodic with respect to the maps $x \mapsto x + (u, y) \pmod 1$ for $y \in \mathcal{L}$.
In particular, if one of the components of $u$ is irrational, then $h(\mu)$ is uniformly distributed.
Also, if $\mu$ is extremal (so that $h(\mu)$ is a point measure), then we must have $u \in \mathcal{L}^*$, where $\mathcal{L}^*$
is the dual lattice of $\mathcal{L}$.
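
The last claim of the lemma can be illustrated for rational slopes. The sketch below is a hedged illustration, not part of the original argument: it assumes the sublattice is given by a list of integer basis vectors, reads $(u, y)$ as the usual inner product, and uses hypothetical helper names (`pairing`, `in_dual_lattice`, `orbit_mod_one`). A point mass on $[0, 1)$ is invariant under all the maps $x \mapsto x + (u, y) \pmod 1$ exactly when the orbit of the point is a single point, which happens exactly when every $(u, y)$ is an integer.

```python
from fractions import Fraction

def pairing(u, y):
    """Inner product (u, y) of a slope u with an integer lattice vector y."""
    return sum(Fraction(ui) * yi for ui, yi in zip(u, y))

def in_dual_lattice(u, basis):
    """True if (u, y) is an integer for every basis vector y of the sublattice."""
    return all(pairing(u, y).denominator == 1 for y in basis)

def orbit_mod_one(x, u, basis):
    """Orbit of x under the maps x -> x + (u, y) (mod 1) generated by the basis.

    For rational u the orbit is finite, so this simple graph search terminates.
    """
    start = Fraction(x) % 1
    seen, frontier = {start}, [start]
    while frontier:
        current = frontier.pop()
        for y in basis:
            step = pairing(u, y)
            for signed in (step, -step):
                nxt = (current + signed) % 1
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return seen
```

For example, with $u = (1/2, 0)$ and the full lattice $\mathbb{Z}^2$ the orbit of 0 is $\{0, 1/2\}$, so no point measure is invariant, while for the sublattice generated by $(2, 0)$ and $(0, 1)$ the same slope lies in the dual lattice and the orbit is a single point.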

We will discuss the existence of height offset variables and their spectra in more detail
in Section 8.7.

Next, we will need some analogous definitions for $\mathcal{L}$-invariant, finite specific free energy
measures $\mu$ on $(\overline{\Omega}, \overline{\mathcal{F}}^\tau)$. In this context, a function $h : \overline{\Omega} \to E \cup \{\infty\}$ is called a height
difference variable for $\mu$ if the following are true:

1. $h(\phi_1 + c_1, \phi_2 + c_2, r) = h(\phi_1, \phi_2, r) + c_2 - c_1$ for all $(\phi_1, \phi_2, r) \in \overline{\Omega}$ and $c_1, c_2 \in E$.

2. $h$ is tail-measurable on $\overline{\Omega}$.

3. $h$ is $\mu$-almost surely finite.

4. If $v \in \mathcal{L}$, then, $\mu$-almost surely, $h(\phi_1, \phi_2, r) = h(\theta_v \phi_1, \theta_v \phi_2, \theta_v r)$.

(We will use this definition primarily for measures $\mu$ almost all of whose ergodic
components have the same slope; hence the last requirement in the definition does not need
a term depending on slope like we have in the definition of a height offset variable.) Again,
our motivating example is that $h$ is the limit of the average difference $\phi_2 - \phi_1$ on
large cubes centered at the origin, if such a limit exists $\mu$-almost surely and satisfies the
above criteria. Now, let $\overline{\mathcal{F}}^0$ be the smallest $\sigma$-algebra in which, for any $x, y \in \mathbb{Z}^d$ and $e \in \mathbb{E}^d$,
the functions $r(e)$, $\phi_1(x) - \phi_1(y)$, $\phi_2(x) - \phi_2(y)$, and $\phi_1(x) - \phi_2(y)$ are all measurable functions
on the set $\overline{\Omega}$; $\overline{\mathcal{F}}^0$ differs from $\overline{\mathcal{F}}^\tau$ in that differences between $\phi_1$ and $\phi_2$ are $\overline{\mathcal{F}}^0$-measurable.
Note the proper inclusions $\overline{\mathcal{F}}^\tau \subset \overline{\mathcal{F}}^0 \subset \overline{\mathcal{F}}$. We define Gibbs measures on $(\overline{\Omega}, \overline{\mathcal{F}}^0)$ analogously
to Gibbs measures on $(\overline{\Omega}, \overline{\mathcal{F}}^\tau)$.

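Lemma 8.4.3 Let $\mu$ be an $\mathcal{L}$-invariant gradient Gibbs measure on $(\overline{\Omega}, \overline{\mathcal{F}}^\tau)$ and $h$ a height
difference variable for $\mu$. Then $\mu$ is the restriction to $\overline{\mathcal{F}}^\tau$ of an $\mathcal{L}$-invariant Gibbs measure $\overline{\mu}$ on
$(\overline{\Omega}, \overline{\mathcal{F}}^0)$. Moreover, $\mu$ is $\mathcal{L}$-ergodic if and only if $\overline{\mu}$ is $\mathcal{L}$-ergodic.

Proof We define $\overline{\mu}$ as follows: to sample from $\overline{\mu}$, first choose $(\phi_1, \phi_2, r)$ (defined up to
additive constant for each of $\phi_1$ and $\phi_2$) from $\mu$, and then output $(\phi_1 + h(\phi_1, \phi_2, r), \phi_2, r)$
(defined up to a single additive constant $c$ for both $\phi_1$ and $\phi_2$). Since $h$ is tail-measurable
(and hence its value is almost surely unchanged by the transition kernels $\gamma_\Lambda$), it is now
straightforward to check that $\overline{\mu}$ is a Gibbs measure on $(\overline{\Omega}, \overline{\mathcal{F}}^0)$. Since $h$ is $\mathcal{L}$-invariant and
$\overline{\mathcal{F}}^0$-measurable, any $\mathcal{L}$-invariant, $\overline{\mathcal{F}}^0$-measurable function of $(\phi_1 + h(\phi_1, \phi_2, r), \phi_2, r)$ can be written
as an $\mathcal{L}$-invariant $\overline{\mathcal{F}}^\tau$-measurable function of $(\phi_1, \phi_2, r)$, and vice versa. It follows that $\mu$ is
ergodic if and only if $\overline{\mu}$ is ergodic. ∎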

Lemma 8.4.4 If $\mu$ is a gradient Gibbs measure on $(\overline{\Omega}, \overline{\mathcal{F}}^\tau)$ and $h$ is a height difference
variable for $\mu$, then both of the first two marginals $\mu_1$ and $\mu_2$ of $\mu$ are smooth.

Proof Define a measure $\overline{\mu}$ on $(\overline{\Omega}, \overline{\mathcal{F}})$ as follows: to sample from $\overline{\mu}$, first choose $(\phi_1, \phi_2, r)$
(defined up to additive constants for $\phi_1$ and $\phi_2$) from $\mu$. Then pick the additive constant
for $\phi_2$ in such a way that $\phi_2(0) = 0$ and the additive constant for $\phi_1$ in such a way that
$h(\phi_1, \phi_2, r) = 0$. Although $\overline{\mu}$ is not a Gibbs measure on $(\overline{\Omega}, \overline{\mathcal{F}})$, its first marginal is a Gibbs
measure on $(\Omega, \mathcal{F})$. This follows from the fact that for any $\Lambda \subset\subset \mathbb{Z}^d$, we have $\overline{\mu} = \overline{\mu}\gamma_\Lambda$
(where $\gamma_\Lambda$ is interpreted as a transition kernel on the first coordinate of $\overline{\Omega}$ only), and that
applying such a transition kernel to the first coordinate of $(\phi_1, \phi_2, r)$ almost surely does not
change either $\phi_2$ or the value $h(\phi_1, \phi_2, r)$. A similar argument holds for the second marginal of $\mu$.
8.5 Infinite cluster classifications

Given $(\phi_1, \phi_2, r)$ chosen from a Gibbs measure on $(\overline{\Omega}, \overline{\mathcal{F}})$, what can we say about the infinite
clusters of $\mathbb{E}^d \setminus S$? How many such clusters are there? How do the clusters change if one adds
a constant to $\phi_1$ or $\phi_2$? In this section, we will explore these and similar questions for Gibbs
measures and gradient Gibbs measures on triplets.

8.5.1 More definitions 

For any $T \subset \mathbb{Z}^d$, we write the following:

1. $T$ is sparse if $\limsup_{n \to \infty} \frac{|T \cap \Lambda_n|}{|\Lambda_n|} = 0$. (Throughout this chapter, we assume that the
cubes $\Lambda_n$ are centered at the origin.)

2. If $\lim_{n \to \infty} \frac{|T \cap \Lambda_n|}{|\Lambda_n|} = \alpha$ for some $\alpha > 0$, then we say that $T$ is $\alpha$-dense or has density $\alpha$.

3. An island of $T$ is a finite component of the complement of $T$ in $\mathbb{Z}^d$.

4. $\overline{T}$ is the union of $T$ and all of the islands of $T$.
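
The first two definitions can be explored numerically. The sketch below is illustrative only: it assumes $\Lambda_n = \{-n, \dots, n\}^d$ (the text fixes only that the cubes are centered at the origin), represents $T$ by a membership test, and uses hypothetical names (`cube`, `cube_fraction`, `checkerboard`, `axis_line`).

```python
from itertools import product

def cube(n, d):
    """Vertices of the cube {-n, ..., n}^d centered at the origin."""
    return product(range(-n, n + 1), repeat=d)

def cube_fraction(in_T, n, d):
    """The ratio |T ∩ Lambda_n| / |Lambda_n| for a set T given by a membership test."""
    total = hits = 0
    for v in cube(n, d):
        total += 1
        hits += bool(in_T(v))
    return hits / total

def checkerboard(v):
    """Vertices with even coordinate sum: a set of density 1/2."""
    return sum(v) % 2 == 0

def axis_line(v):
    """The first coordinate axis in Z^2: an infinite but sparse set."""
    return v[1] == 0
```

As $n$ grows, `cube_fraction(checkerboard, n, 2)` approaches 1/2, while `cube_fraction(axis_line, n, 2)` equals $1/(2n + 1)$ and tends to zero.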

We will also apply the first two definitions to subsets of the edges of $\mathbb{Z}^d$. If $T$ is chosen from
an $\mathcal{L}$-invariant measure on the space of subsets of $\mathbb{Z}^d$, then the ergodic theorem implies that,
almost surely, it will either be empty or have some positive density $\alpha$.
Given $(\phi_1, \phi_2, r) \in \overline{\Omega}$, we define the following variables (whose dependence on $(\phi_1, \phi_2, r)$ will
always be understood):

1. $S_c$ is the set of all edges for which $(\phi_1 + c, \phi_2, r)$ is swappable. (Note: $S_c$ is not
$\overline{\mathcal{F}}^\tau$-measurable, since it depends on the arbitrary constants used to define $\phi_1$ and $\phi_2$.)

2. $T_c^+$ is the union of all vertices $v$ in infinite clusters of $\mathbb{E}^d \setminus S_c$ for which $\phi_2(v) > \phi_1(v) + c$
throughout the cluster. Similarly, $T_c^-$ contains the vertices $v$ in infinite clusters of
$\mathbb{E}^d \setminus S_c$ for which $\phi_2(v) < \phi_1(v) + c$ throughout the cluster. (Note: if $\phi_1(v) + c = \phi_2(v)$,
then every edge incident to $v$ is contained in $S_c$. Also, observe that, given $\phi_1$ and $\phi_2$,
$T_c^+$ is decreasing in $c$ and $T_c^-$ is increasing in $c$.)

3. $B^+ = \inf\{c : T_c^+ \text{ is empty}\}$. We say $B^+ = \infty$ if no such $c$ exists, i.e., if $T_c^+$ fails to be
empty for any finite $c$. $B^-$ is defined similarly: $B^- = \sup\{c : T_c^- \text{ is empty}\}$. In this
case, $B^- = -\infty$ if no such $c$ exists.

For a concrete example, the reader may check that in the perfect matching example
described earlier, an edge is in $S_c$ if and only if it has an endpoint $x$ that satisfies $\phi_2(x) - \phi_1(x) = c$.
Also, $T_c^+$ is the set of points in or surrounded by infinite clusters on which $\phi_2 - \phi_1 > c$ and
$T_c^-$ the set of points in or surrounded by infinite clusters on which $\phi_2 - \phi_1 < c$. Moreover, $B^-$
and $B^+$ are simply the smallest and largest values of $c$ for which the level set $(\phi_2 - \phi_1)^{-1}(c)$
has an infinite cluster. In this setting, $B^- = B^+$ if and only if the union of the corresponding
perfect matchings contains no infinite paths.

The following is now a clear consequence of the above definitions: 
Lemma 8.5.1 The set $\{c : T_c^+ = \emptyset\}$ is equal to the interval $[B^+, \infty)$ if $E = \mathbb{Z}$ and either
$(B^+, \infty)$ or $[B^+, \infty)$ if $E = \mathbb{R}$. Similarly, $\{c : T_c^- = \emptyset\}$ is equal to $(-\infty, B^-]$ if $E = \mathbb{Z}$ and
either $(-\infty, B^-)$ or $(-\infty, B^-]$ if $E = \mathbb{R}$. Note that if $c \in (B^-, B^+)$, then neither $T_c^-$ nor
$T_c^+$ is empty.

Also, although $B^-$ and $B^+$ are not $\overline{\mathcal{F}}^\tau$-measurable, the following events are tail events in
$\overline{\mathcal{F}}^\tau$:

1. $\{(\phi_1, \phi_2, r) : B^+ = \infty\}$ or $\{(\phi_1, \phi_2, r) : B^+ = -\infty\}$ (or similar sets produced by
replacing $B^+$ with $B^-$)

2. $\{(\phi_1, \phi_2, r) : B^+ \text{ and } B^- \text{ are both finite and } B^+ - B^- \in A\}$ (where $A$ is a Borel subset
of $E$)

If an $\mathcal{L}$-ergodic gradient Gibbs measure $\mu$ on $(\overline{\Omega}, \overline{\mathcal{F}}^\tau)$ admits a height difference variable
$h$ (as defined in Section 8.4), then the following follows from the definitions:

Lemma 8.5.2 The values $B^- - h$ and $B^+ - h$ are both tail-measurable, $\overline{\mathcal{F}}^\tau$-measurable,
$\mathcal{L}$-invariant functions of $\overline{\Omega}$; if $\mu$ is $\mathcal{L}$-ergodic, then they are both $\mu$-almost surely constant.

8.5.2 Coupling extremal smooth Gibbs measures 

Our first application of the above definitions to the comparison of Gibbs measures is the 
following lemma: 

Lemma 8.5.3 Suppose that $\mu_1$ and $\mu_2$ are extremal (non-gradient) Gibbs measures on $\Omega$.
Then there exist values $B_0^-$ and $B_0^+$ such that $\mu_1 \otimes \mu_2 \otimes \pi$-almost surely, we have $B^+ = B_0^+$
and $B^- = B_0^-$. Moreover, $\mu_1 + B_0^- \prec \mu_2 \prec \mu_1 + B_0^+$. In particular, $B_0^- \le B_0^+$, with equality
if and only if, up to additive constant, $\mu_1 = \mu_2$ (i.e., the restrictions of $\mu_1$ and $\mu_2$ to $\mathcal{F}^\tau$ are
equivalent).

Proof The first statement simply follows from the fact that $B^+$ and $B^-$ are tail measurable
functions. To prove the stochastic domination, we construct a coupling explicitly. Suppose
that $T_c^+$ is empty; that is, there is no infinite cluster in $\mathbb{E}^d \setminus S_c$ on which $\phi_2 > \phi_1 + c$. Fix $k$.
Then given a large box $\Lambda_n$, we let $\tilde{\Lambda}_n$ be the set of vertices in $\Lambda_n$ that are not connected to
$\partial\Lambda_n$ by paths in $\mathbb{E}^d \setminus S_c$. These vertices are "isolated" from the boundary $\partial\Lambda_n$.

Now, consider the swapping map on triplets that swaps at all vertices inside of $\tilde{\Lambda}_n$ and
fixes all vertices outside of $\tilde{\Lambda}_n$. As in the proof of Lemma 8.2.1, we can use this
measure-preserving involution to define a coupling $\nu_n \in \mathcal{P}(\Omega \times \Omega, \mathcal{F} \times \mathcal{F})$ of $\mu_1$ and $\mu_2$: to sample
from $\nu_n$, first choose $(\phi_1, \phi_2, r)$ from $\mu_1 \otimes \mu_2 \otimes \pi$. Then modify $\phi_1$ and $\phi_2$ in a way that
replaces both of the values $\phi_1 + c$ and $\phi_2$ inside of $\tilde{\Lambda}_n$ with either the values of $\phi_1 + c$ (with
probability 1/2) or the values of $\phi_2$ (with probability 1/2), and output the modified pair
$(\phi_1, \phi_2)$.

Now, fix a smaller box $\Lambda_k$, centered inside of $\Lambda_n$; as $n$ tends to $\infty$, the probability that
a pair produced by this coupling satisfies $\phi_2 > \phi_1 + c$ at some point in $\Lambda_k$ is bounded above by
the $\mu_1 \otimes \mu_2 \otimes \pi$ probability that there exists a path in $\mathbb{E}^d \setminus S_c$ from $\Lambda_k$ to $\partial\Lambda_n$ along which
$\phi_2 > \phi_1 + c$. This probability tends to zero as $n \to \infty$. Now we claim that for $\mu_1 \otimes \mu_2 \otimes \pi$-almost
all triplets, there is a subsequential limit (in the topology of local convergence) of
these couplings $\nu_n$ which is a coupling $\nu$ of $\mu_1$ and $\mu_2$ in which, $\nu$ almost surely, $\phi_2 \ge \phi_1 + c$;
hence $\mu_1 + c \prec \mu_2$. (That the marginals have limits follows from Lemma 3.2.3, and this
implies the necessary tightness to ensure existence of a subsequential limit.) The first half
of the stochastic domination statement in the lemma now follows by taking $c = B_0^-$ (if $E = \mathbb{Z}$)
or by taking limits of $\nu$ defined by taking $c = c_i$, where the $c_i$ converge to $B_0^-$ from below (if
$E = \mathbb{R}$). ∎

Note that if both $T_c^+$ and $T_c^-$ are non-empty for all values of $c \in \mathbb{R}$ (as might occur,
for example, when samples from $\mu_1$ and $\mu_2$ approximate planes of different slopes), then
$B_0^- = -\infty$ and $B_0^+ = \infty$ and the above lemma gives us no information. The lemma also
implies that in the setting of Lemma 8.5.3 we cannot have either $B_0^- = \infty$ or $B_0^+ = -\infty$ (as
would occur if either $T_c^-$ or $T_c^+$ were empty for all values of $c \in \mathbb{R}$); the former would imply
$\mu_1 + \infty \prec \mu_2$ (or, precisely, $\mu_1 + c \prec \mu_2$ for all $c \in \mathbb{R}$), which is impossible. The latter gives
a similar contradiction. The following two lemmas are simple consequences of Lemma 8.5.3.

Lemma 8.5.4 If $\mu_1$ and $\mu_2$ are distinct gradient phases, then $\mu_1 \otimes \mu_2 \otimes \pi$-almost surely,
$B^- < B^+$.

Proof By Lemma 8.5.3, it is enough to observe that $w_{\mu_1}$ and $w_{\mu_2}$ are mutually singular,
which follows from Lemma 3.2.3. ∎

Lemma 8.5.5 If $\mu$ is a non-extremal gradient phase, then with $\mu \otimes \mu \otimes \pi$ positive probability,
$B^- < B^+$.

Proof By Lemma 8.5.3, it is enough to observe that when $(\nu_1, \nu_2)$ are chosen from $w_\mu \times w_\mu$,
there is a positive probability that $\nu_1 \ne \nu_2$. ∎

(Since in the perfect matching model, B~ < B + if and only if the union of the two 
matchings contains an infinite path, we may view Lemma |H331 as a generalization of Lemma 

EH) 

8.5.3 Uniqueness of infinite clusters 

Lemma 8.5.6 Let $\mu$ be an $\mathcal{L}$-ergodic gradient Gibbs measure on $(\overline{\Omega}, \overline{\mathcal{F}}^\tau)$ with slope $(u, v)$,
where $u, v \in U_\Phi$; suppose that $h$ is a height difference variable for $\mu$. Then there exists no
$c \in \mathbb{R}$ for which, with $\mu$-positive probability, either $T^+_{c-h}$ or $T^-_{c-h}$ consists of more than one
infinite component.

Proof The number of infinite clusters in $T^+_{c-h}$ is an $\mathcal{L}$-invariant, tail measurable, and
$\overline{\mathcal{F}}^\tau$-measurable random variable. As such, it is almost surely constant. Suppose that the number of infinite
clusters is almost surely $k$ for some $1 < k < \infty$, and that the triplet $(\phi_1, \phi_2, r)$ is sampled
from $\mu$.

Let $P$ be a path connecting points in two distinct infinite clusters of $T^+_{c-h}$. Observe
that the set $T^+_{c-h}$ only increases in size if we increase the value of $\phi_2$ at any finite set of
points. For each edge $e = (x, y)$ in $P$, there exists some value $a_e$ such that if $\phi_2$ is modified
so that $\phi_2(x) > a_e$ and $\phi_2(y) > a_e$, and $r$ and $\phi_1$ are left unchanged, then we cannot have
$(x, y) \in S_{c-h}$. By Lemma 4.3.5, it is thus almost surely possible to alter $\phi_2$ in a finite number
of places (keeping the energy finite) in a way that connects two of the clusters of $T^+_{c-h}$. It
follows that for some $n$, there are not $\gamma_{\Lambda_n}(\cdot \mid \phi_1, \phi_2, r)$-almost surely exactly $k$ distinct clusters
of $T^+_{c-h}$. Since this is $\mu$ almost surely the case for $(\phi_1, \phi_2, r)$ and some $n$, there must exist
an $n$ for which this is the case with positive $\mu$ probability. But then it cannot be true that
there are $\mu\gamma_{\Lambda_n}$-almost surely $k$ infinite clusters of $T^+_{c-h}$, so this is a contradiction.

Now, it remains only to rule out the case of infinitely many clusters. The following 
argument is due to Burton and Keane (see JT] or jST]). By similar arguments to the above, 
we see that for some $\epsilon$ and $\Lambda_n$, there is a positive $\mu$ probability that applying the transition
kernel $\gamma_{\Lambda_n}$ to a configuration has an $\epsilon$ probability of joining three or more infinite clusters
together. By the same token, there is a positive probability that applying the transition kernel
$\gamma_{\Lambda_n}$ breaks a single infinite cluster into three or more infinite clusters. Tile all of $\mathbb{Z}^d$ with
boxes of size $\Lambda_n$. A given box is called a trifurcation box if removing a connected cluster of
vertices inside of the box causes a single infinite cluster to break into three or more pieces.

Now, let $Y$ be any finite set with $|Y| \ge 3$. A 3-partition of $Y$ is a partition $\{P_1, P_2, P_3\}$
of $Y$ with exactly three non-empty sets $P_1$, $P_2$, and $P_3$. Two 3-partitions $\{P_1, P_2, P_3\}$ and
$\{Q_1, Q_2, Q_3\}$ are compatible if there is an ordering of their elements such that $P_1 \supseteq Q_2 \cup Q_3$
(or, equivalently, such that $Q_1 \supseteq P_2 \cup P_3$). We cite the following fact from Burton and Keane
(Lemma 8.5 of [43]): If $\mathcal{P}$ is a family of distinct 3-partitions of $Y$ such that each pair of
elements in $\mathcal{P}$ is compatible, then $|\mathcal{P}| \le |Y| - 2$.
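
For very small sets $Y$ the cited bound can be confirmed by exhaustive search. The sketch below is an illustration only, not the Burton and Keane proof: the function names are hypothetical, and the brute-force clique search is practical only for $|Y|$ up to about 5.

```python
from itertools import combinations, product

def three_partitions(elements):
    """All partitions of `elements` into exactly three non-empty blocks."""
    elements = list(elements)
    found = set()
    for labels in product(range(3), repeat=len(elements)):
        if set(labels) == {0, 1, 2}:
            blocks = frozenset(
                frozenset(e for e, lab in zip(elements, labels) if lab == k)
                for k in range(3)
            )
            found.add(blocks)
    return list(found)

def compatible(P, Q):
    """True if some block of P contains the union of two blocks of Q."""
    return any(p >= (q1 | q2) for p in P for q1, q2 in combinations(Q, 2))

def largest_compatible_family(partitions):
    """Size of the largest pairwise-compatible family (exhaustive search)."""
    best = 0

    def extend(size, candidates):
        nonlocal best
        best = max(best, size)
        for i, p in enumerate(candidates):
            # Keep only later candidates compatible with p, so families stay pairwise compatible.
            extend(size + 1, [q for q in candidates[i + 1:] if compatible(p, q)])

    extend(0, partitions)
    return best
```

For $|Y| = 4$ the search returns 2, matching $|Y| - 2$: two 3-partitions of a four-point set are compatible exactly when their two-element blocks are complementary.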

Now, for a large value of $k$, let $\Lambda_{kn}$ be a box of side length $kn$; observe that for each
trifurcation box of an infinite cluster $C$, we can choose a partition of $\partial\Lambda_{kn} \cap C$ into three
sets, each of which is the intersection of $\partial\Lambda_{kn}$ with one of the three components of the infinite
cluster that is broken apart. In fact, as Burton and Keane observe (again, see Lemma 8.5 of 
|43j or JI]), the set of partitions corresponding to the trifurcation points of the intersection 
with any particular cluster forms a compatible family of partitions of the intersection of $\partial\Lambda_{kn}$
with that infinite cluster. This implies that the total number of trifurcation points in $\Lambda_{kn}$ is
less than $|\partial\Lambda_{kn}|$. Since the expected number of such points must grow linearly in $|\Lambda_{kn}|$, this is
a contradiction. ∎

8.6 Gradient phase uniqueness and $\sigma$ strict convexity

8.6.1 Statement of main uniqueness and convexity results 

In order to state the main results of this section, we will need the following definition.
If $E = \mathbb{Z}$, then we say that a pair of ergodic gradient phases $\mu_1$ and $\mu_2$ on $(\Omega, \mathcal{F}^\tau)$ are
quasi-equivalent to one another if the following are true:

1. Each $\mu_i$ is a smooth phase; i.e., it is a restriction to $\mathcal{F}^\tau$ of a Gibbs measure $\mu_i'$ on $(\Omega, \mathcal{F})$.

2. $\mu_1 \otimes \mu_2 \otimes \pi$-almost surely, we have $B^+ - B^- \in \{0, 1\}$.

3. $S(\mu_1) = S(\mu_2)$ (a simple consequence of the previous item when each $\mu_i$ has a
well-defined slope).

Note that by Lemma 8.5.3, the second item implies that a pair of extremal components
$(\nu_1, \nu_2)$ sampled from $w_{\mu_1} \otimes w_{\mu_2}$ almost surely satisfies (if additive constants are chosen
correctly) $\nu_1 \prec \nu_2 \prec \nu_1 + 1$.

The following theorem is central to this section: 

Theorem 8.6.1 Suppose that $\mu$ is a measure on $(\overline{\Omega}, \overline{\mathcal{F}}^\tau)$ whose first two marginals $\mu_1$ and
$\mu_2$ are minimal $\mathcal{L}$-ergodic gradient phases on $(\Omega, \mathcal{F}^\tau)$ with slopes in $U_\Phi$. Suppose further
that with $\mu$ positive probability, we have $B^+ > B^-$ (when $E = \mathbb{R}$) or $B^+ > B^- + 1$ (when
$E = \mathbb{Z}$). Then for some appropriately defined "infinite cluster swapping map" $R$, we have
$SFE(R(\mu)) \le SFE(\mu)$ and $S_a(\mu) = S_a(R(\mu))$; moreover, $R(\mu)$ is an $\mathcal{L}$-invariant gradient
measure on $(\overline{\Omega}, \overline{\mathcal{F}}^\tau)$ which is not a gradient Gibbs measure.

From this theorem, we can immediately deduce surface tension strict convexity and
ergodic gradient Gibbs phase uniqueness as corollaries:

Theorem 8.6.2 The surface tension $\sigma$ is strictly convex in $U_\Phi$.

Proof Pick distinct slopes $u_1$ and $u_2$ in $U_\Phi$. By Lemma 8.2.7, there exist $\mathcal{L}$-ergodic Gibbs
measures $\mu_1$ and $\mu_2$ of slopes $u_1$ and $u_2$ with $SFE(\mu_1) = \sigma(u_1)$, $SFE(\mu_2) = \sigma(u_2)$. Write
$\mu = \mu_1 \otimes \mu_2 \otimes \pi$. From Lemma 8.5.3, it cannot be the case that, $\mu$-almost surely, $B^+ \le B^- + 1$
(since this would imply that the conclusion of Lemma 8.5.3 applies to measures of different
slopes). Thus, $\mu$ satisfies the requirements of Theorem 8.6.1. Let $(v_1, v_2)$ be the slope of
$R(\mu_1 \otimes \mu_2 \otimes \pi)$. Since $R(\mu)$ is not a gradient Gibbs measure, by Lemma 8.3.2 and Lemma
8.3.3, the convexity of $\sigma$, and the fact that $S_a(\mu) = S_a(R(\mu))$,

$$\sigma(u_1) + \sigma(u_2) \ge SFE(R(\mu_1 \otimes \mu_2 \otimes \pi)) > \sigma(v_1) + \sigma(v_2) \ge 2\sigma\left(\frac{v_1 + v_2}{2}\right) = 2\sigma\left(\frac{u_1 + u_2}{2}\right).$$

Since $\sigma$ is already known to be convex, it easily follows from this that $\sigma$ is strictly convex
on $U_\Phi$. ∎

Although $\sigma$ is also convex on the boundary of $U_\Phi$, it is not necessarily strictly convex there.
The surface tension function for domino tilings described in for example, is constant 
on the boundary of $U_\Phi$. Using Theorem 8.6.1, we can deduce another key result. In light of
Lemma 8.5.6, we may view the $E = \mathbb{Z}$ part of this statement as a generalization of Lemma

i8~mn 
Theorem 8.6.3 If $E = \mathbb{R}$, then for every $u \in U_\Phi$, there exists a unique minimal gradient
phase $\mu_u$ of slope $u$. If $E = \mathbb{Z}$, and $\mu_1$ and $\mu_2$ are distinct minimal gradient phases of slope
$u$, then $\mu_1$ and $\mu_2$ are quasi-equivalent.

Proof By Lemma 8.2.7, there exists at least one minimal gradient phase of slope $u$. Now,
suppose that $\mu_1$ and $\mu_2$ are distinct minimal gradient phases of slope $u$ which are not
quasi-equivalent; then $\mu = \mu_1 \otimes \mu_2 \otimes \pi$ satisfies the requirements of Theorem 8.6.1. Let $(v_1, v_2)$
be the slope of $R(\mu_1 \otimes \mu_2 \otimes \pi)$. Since $R(\mu_1 \otimes \mu_2 \otimes \pi)$ is not a gradient Gibbs measure, by
Lemma 8.3.2 and Lemma 8.3.3, the convexity of $\sigma$, and the fact that $S_a(\mu) = S_a(R(\mu))$, we
have the following contradiction:

$$2\sigma(u) \ge SFE(R(\mu_1 \otimes \mu_2 \otimes \pi)) > \sigma(v_1) + \sigma(v_2) \ge 2\sigma\left(\frac{v_1 + v_2}{2}\right) = 2\sigma(u).$$



The remainder of this section is devoted to the proof of Theorem 8.6.1 and the definition
of the cluster swapping map $R$ required by the theorem. We will obtain $R(\mu)$ from $\mu$ by
either a "single infinite swap" or an "infinitely repeated cluster swap" as $\mu$ respectively does
or does not admit a height difference variable $h$.



8.6.2 Single infinite cluster swap 

Suppose that $h$ is a height difference variable for $\mu$; write $B_0^- = B^- - h$ and $B_0^+ = B^+ - h$.
We may assume that either $B^- < B^+$ with positive probability and $E = \mathbb{R}$, or $B^- < B^+ - 1$
with positive probability and $E = \mathbb{Z}$. In particular, there exists a $c \in E$ such that with
positive $\mu$ probability, $B_0^- < c < B_0^+$, i.e., $B^- < c + h < B^+$. Here, both $T^+_{c+h}$ and $T^-_{c+h}$ are
nonempty.

For any $c$, we define $R_c(\mu)$ to be the measure obtained as follows: to sample from $R_c(\mu)$,
first sample $(\phi_1, \phi_2, r)$ from $\mu$. Then add $c + h$ to $\phi_1$, do a cluster swap that swaps everything
outside of the new $T^-$ to get a new triple, and then subtract $c + h$ from the first coordinate of
the new triple. To say this in precise terms, we define a map $R_c : \overline{\Omega} \to \overline{\Omega}$ (defined $\mu$-almost
surely) by $R_c(\phi_1, \phi_2, r) = (\psi_1, \psi_2, s)$ where

$$\psi_1(x) = \begin{cases} \phi_1(x) & x \in T^-_{c+h} \\ \phi_2(x) - c - h & \text{otherwise} \end{cases}$$

$$\psi_2(x) = \begin{cases} \phi_2(x) & x \in T^-_{c+h} \\ \phi_1(x) + c + h & \text{otherwise} \end{cases}$$

and for each $e \in \mathbb{E}^d$, define $s(e)$ in such a way that $(\phi_1, \phi_2, r)$ and $(\psi_1, \psi_2, s)$ have the same
total energy at the edge $e$.
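
The role of $s(e)$ can be seen on a single edge. The sketch below is a simplified illustration under stated assumptions, not the construction of Section 8.1: it assumes one symmetric nearest-neighbor potential `V` (taken to be quadratic in the example), takes the energy at an edge to be `V` of each gradient plus the residual $r(e)$, and omits the shift by $c + h$. All function names are hypothetical.

```python
def edge_energy(V, phi1, phi2, r, x, y):
    """Total energy carried by the edge (x, y): V on both gradients plus the residual r."""
    return V(phi1[y] - phi1[x]) + V(phi2[y] - phi2[x]) + r

def swap_at(phi1, phi2, vertices):
    """Exchange the values of phi1 and phi2 at the given vertices."""
    psi1, psi2 = dict(phi1), dict(phi2)
    for v in vertices:
        psi1[v], psi2[v] = phi2[v], phi1[v]
    return psi1, psi2

def new_residual(V, phi1, phi2, r, x, y, swapped):
    """Residual s(e) that keeps the total energy at e = (x, y) unchanged after the swap.

    A negative return value means that no admissible s(e) >= 0 exists for this swap.
    """
    psi1, psi2 = swap_at(phi1, phi2, swapped)
    before = edge_energy(V, phi1, phi2, r, x, y)
    after_without_s = edge_energy(V, psi1, psi2, 0.0, x, y)
    return before - after_without_s
```

With `V = lambda t: t * t`, `phi1 = {"x": 0, "y": 0}`, `phi2 = {"x": 1, "y": 1}`, and `r = 2.5`, swapping only at `"y"` raises the gradient energy from 0 to 2, so the new residual is 0.5; swapping at both endpoints leaves the residual equal to `r`.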

Now, if $E = \mathbb{Z}$, then we write $R(\mu) = R_c(\mu)$, where $c$ is some integer with the property
that $T^+_{c+h}$ and $T^-_{c+h}$ are both non-empty with positive probability (i.e., $B_0^- < c < B_0^+$).
If $E = \mathbb{R}$, then (in order to simplify a free energy computation) we will instead write
$R(\mu) = \frac{1}{c_2 - c_1} \int_{c_1}^{c_2} R_c(\mu)\, dc$, where $B_0^- < c_1 < c_2 < B_0^+$.

Clearly, $R(\mu)$ is $\mathcal{L}$-invariant. Now, we need to argue that $SFE(R(\mu)) \le SFE(\mu)$, i.e.,
that $\lim_{n \to \infty} |\Lambda_n|^{-1} FE_{\Lambda_n}(R(\mu)) \le \lim_{n \to \infty} |\Lambda_n|^{-1} FE_{\Lambda_n}(\mu)$. Recall the definition in this context:

$$FE_{\Lambda_n}(\mu) = \mathcal{H}\left(\mu_{\Lambda_n},\ e^{-H^o_{\Lambda_n}(\phi_1, \phi_2, r)} \prod_{x \in \Lambda_n \setminus \{x_0\}} d[\phi_1(x) - \phi_1(x_0)]\, d[\phi_2(x) - \phi_2(x_0)] \prod_e dr(e)\right).$$

Now, if we could show that $R(\mu)_{\Lambda_n}$ is the image of $\mu_{\Lambda_n}$ under an injective map $R$ (from
the space of configurations on $\Lambda_n$ to itself) which preserves the above measure, then the
equivalence of $FE_{\Lambda_n}(\mu)$ and $FE_{\Lambda_n}(R(\mu))$ would be obvious. However, such a map cannot
quite be well-defined: to determine the gradient values of the reflection map $R_c(\phi_1, \phi_2, r)$,
restricted to a set $\Lambda_n$, it is not quite enough to know $\phi_1(x) - \phi_1(x_0)$, $\phi_2(x) - \phi_2(x_0)$ and
$r(e)$ for all vertices and edges in $\Lambda_n$. It is also necessary to know the difference between
the additive constants of $\phi_1$ and $\phi_2$ (i.e., to know the value $\phi_2(x_0) - \phi_1(x_0) - c - h$ at some
reference vertex $x_0 \in \Lambda_n$) and to know which of the vertices $x \in \partial\Lambda_n$ are members of $T^-_{c+h}$. But
if we expand our definition of "configuration on $\Lambda_n$" to include this additional information,
then we can make the map well-defined.

To this end, we write $F(x) = 0$ if $x \in T^-_{c+h}$, and $F(x) = 1$ otherwise. Given $\Lambda_n$, we
will let $F_{\partial\Lambda_n}$ be the restriction of $F$ to the boundary of $\Lambda_n$. Now, let $\mu'_{\Lambda_n}$ be the law of the
five-tuple $(\phi_1 - \phi_1(x_0),\ \phi_2 - \phi_2(x_0),\ r,\ F_{\partial\Lambda_n},\ c_0 = \phi_2(x_0) - \phi_1(x_0) - c - h)$, where $x_0$ is a
reference vertex defined on the boundary of $\Lambda_n$, the $\phi_i$'s are defined on $\Lambda_n \setminus \{x_0\}$, $r$ is defined
on all edges within $\Lambda_n$, and $c_0 \in \mathbb{R}$.

Now, we can think of $R$ as a measure preserving map on the space of five-tuples in the
following way. Given $(\phi_1 - \phi_1(x_0), \phi_2 - \phi_2(x_0), r, F_{\partial\Lambda_n}, c_0)$, we can compute a new five-tuple
$R(\phi_1 - \phi_1(x_0), \phi_2 - \phi_2(x_0), r, F_{\partial\Lambda_n}, c_0) = (\psi_1, \psi_2, s, F_{\partial\Lambda_n}, c_0)$ as follows. Define the cluster
$C$ to be the set of all vertices $v$ for which there is a path $P_v$ from $v$ to a vertex $x_v \in \partial\Lambda_n$ such
that $F(x_v) = 0$ and no edge of $P_v$ is swappable in $(\phi_1, \phi_2, r)$ (where the relative additive
constants of $\phi_1$ and $\phi_2$ are chosen in such a way that $c + h = 0$, i.e., $\phi_2(x_0) - \phi_1(x_0) = c_0$).
The set $C$ is then simply $\Lambda_n \cap T^-_{c+h}$. Now determine $(\psi_1, \psi_2, s)$ by fixing the values of $(\phi_1, \phi_2, r)$
inside of $C$, swapping the values outside of $C$, and adjusting $s$ so that the map is energy
preserving on each edge; leave the values of $F_{\partial\Lambda_n}$ and $c_0$ unchanged. (Note: the value $F_{\partial\Lambda_n}$
is always defined based on the infinite clusters of the original triplet $(\phi_1, \phi_2, r)$; the value of
$F_{\partial\Lambda_n}$ after swapping should not be interpreted as referring to infinite clusters in an infinite
post-swapping configuration. We include the same $F_{\partial\Lambda_n}$ in both the pre-swap and post-swap
five-tuples because doing so allows us to make the swapping map invertible.)

Now, define $FE_{\Lambda_n}(\mu')$ to be the relative entropy of $\mu'_{\Lambda_n}$ with respect to $\nu_1 \otimes \nu_2 \otimes \nu_3$, where

$$\nu_1 = e^{-H^o_{\Lambda_n}(\phi_1, \phi_2, r)} \prod_{x \in \Lambda_n \setminus \{x_0\}} d[\phi_1(x) - \phi_1(x_0)]\, d[\phi_2(x) - \phi_2(x_0)] \prod_{e \in \Lambda_n} dr(e),$$

where each $d[\phi_i(x) - \phi_i(x_0)]$ and each $dr(e)$ is Lebesgue measure (and we write $e \in \Lambda_n$
when both endpoints of $e$ are in $\Lambda_n$); $\nu_2 = dc_0$ is Lebesgue measure; and $\nu_3 = dF(x)$ is
counting measure. We can now use arguments similar to those given in Section 8.1 to see
that the swapping map on five-tuples described above is invertible and measure preserving
with respect to the measure $\nu_1 \otimes \nu_2 \otimes \nu_3$. (It is enough to observe that each $R_c$ is invertible,
well-defined, and $\nu_1 \otimes \nu_3$-measure preserving on each of the regions $X_{F_0, C_0}$ on which $F = F_0$
and $C = C_0$.) It follows that $FE_{\Lambda_n}(\mu') = FE_{\Lambda_n}(R(\mu'))$.

By Lemma 2.1.3, $FE_{\Lambda_n}(\mu')$ is equal to $FE_{\Lambda_n}(\mu)$ plus the $\mu$ expectation of the relative
entropy of $c_0$ (with respect to Lebesgue measure) given $(\phi_1, \phi_2, r)$, plus the expectation of the
relative entropy of $F$ (with respect to counting measure) given the four-tuple $(\phi_1, \phi_2, r, c_0)$.

Now, let $\mu'^{\,\phi_1, \phi_2, r}$ be the regular conditional probability for $c_0$ given $\phi_1 - \phi_1(x_0)$, $\phi_2 - \phi_2(x_0)$,
and $r$. Similarly, let $\mu'^{\,\phi_1, \phi_2, r, c_0}$ be the regular conditional probability for $F$ given the four-tuple
$(\phi_1, \phi_2, r, c_0)$. Now, we can phrase Lemma 2.1.3 as follows:

$$FE_{\Lambda_n}(\mu') = FE_{\Lambda_n}(\mu) + \mu\,\mathcal{H}(\mu'^{\,\phi_1, \phi_2, r}, \nu_2) + \mu'\,\mathcal{H}(\mu'^{\,\phi_1, \phi_2, r, c_0}, \nu_3).$$

Note that the relative entropy of the conditional distribution of $F$ with respect to counting measure is bounded
between $-|\partial\Lambda_n| \log(3)$ and 0. Also, since $c$ is chosen uniformly in an interval of
length $(c_2 - c_1)$ independently of $\phi_1, \phi_2, r$, it is clear that the second term on the righthand
side is at most $-\log(c_2 - c_1)$ (or zero if $E = \mathbb{Z}$). Thus,

$$FE_{\Lambda_n}(R(\mu')) = FE_{\Lambda_n}(\mu') \le FE_{\Lambda_n}(\mu) + o(|\Lambda_n|).$$
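
The entropy bounds in this argument rest on a standard fact: a uniform density on an interval of length $\ell$ has relative entropy $-\log \ell$ with respect to Lebesgue measure, and no other density supported on that interval does better. The numerical sketch below is an illustration only; the midpoint-rule integrator and the tent-shaped comparison density are assumptions chosen for the example, and the function names are hypothetical.

```python
import math

def relative_entropy_vs_lebesgue(density, length, cells=20_000):
    """Midpoint-rule approximation of the integral of f log f over [0, length]."""
    width = length / cells
    total = 0.0
    for i in range(cells):
        f = density((i + 0.5) * width)
        if f > 0.0:
            total += f * math.log(f) * width
    return total

def uniform(length):
    """Uniform probability density on [0, length]."""
    return lambda x: 1.0 / length

def triangular(length):
    """Tent-shaped probability density on [0, length] with peak 2 / length."""
    return lambda x: (2.0 / length) * (1.0 - abs(2.0 * x / length - 1.0))
```

For `length = 4.0`, the uniform density gives a value close to `-math.log(4.0)`, while the tent density gives a strictly larger value, consistent with the uniform density being the minimizer.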

Next, using similar notation to the above for $R(\mu)$ and $R(\mu')$ instead of $\mu$ and $\mu'$,

$$FE_{\Lambda_n}(R(\mu')) = FE_{\Lambda_n}(R(\mu)) + R(\mu)\,\mathcal{H}(R(\mu')^{\phi_1, \phi_2, r}, \nu_2) + R(\mu')\,\mathcal{H}(R(\mu')^{\phi_1, \phi_2, r, c_0}, \nu_3).$$

This time we need a lower bound on the second term; informally, we must show that we
do not expect the distribution of $c_0$, given $\phi_1$ and $\phi_2$ and $r$, to be too "spread out." The
easiest way to do this is to slightly alter our definition of $c_0$. Recall that we defined $c_0$ by
$c_0 = \phi_2(x_0) - \phi_1(x_0) - c - h$. We then determined the clusters on which swapping occurred by
looking at edges at which $(\phi_1 - \phi_1(x_0), \phi_2 - \phi_2(x_0) + c_0, r)$ is swappable. For given values
of the triplet $(\phi_1 - \phi_1(x_0), \phi_2 - \phi_2(x_0), r)$, let $b_1$ and $b_2$ be the lower and upper bounds on the
set of choices of $c_0$ for which $(\phi_1 - \phi_1(x_0), \phi_2 - \phi_2(x_0) + c_0, r)$ has any swappable edges on $\Lambda_n$;
clearly, if $c_0$ lies outside of the interval $[b_1, b_2]$, $F$ will be constant on $\Lambda_n$ and the swapping
map will either fix all of $\Lambda_n$ or swap all of $\Lambda_n$. Thus, if $\phi_2(x_0) - \phi_1(x_0) - h \notin [b_1 + c_1, b_2 + c_2]$,
then there will be no swappable edges regardless of how $c$ is chosen in $(c_1, c_2)$. Our new way
of choosing $c_0$ will be as follows: first let

$$B = \begin{cases} b_1 + c_1 & \phi_2(x_0) - \phi_1(x_0) - h < b_1 + c_1 \\ b_2 + c_2 & \phi_2(x_0) - \phi_1(x_0) - h > b_2 + c_2 \\ \phi_2(x_0) - \phi_1(x_0) - h & \text{otherwise} \end{cases}$$

then, as before, choose $c$ uniformly in $[c_1, c_2]$ and write $c_0 = B - c$ when $E = \mathbb{R}$, and simply
$c_0 = B$ when $E = \mathbb{Z}$. Observe that the above definitions and arguments remain valid
with this new definition of $c_0$. Using the fact that $\mu$ has finite specific free energy, it is
not hard to show that the expected value of $b_2 - b_1$ is $O(|\Lambda_n|)$. Now, since the expected
length of the interval on which $R(\mu')^{\phi_1, \phi_2, r}$ is supported is $O(|\Lambda_n|)$, and the minimal relative
entropy with respect to an interval of length $k$ is $-\log k$, Jensen's inequality implies that
$R(\mu)\,\mathcal{H}(R(\mu')^{\phi_1, \phi_2, r}, \nu_2) \ge -O(\log |\Lambda_n|)$.
Thus, we have

$$FE_{\Lambda_n}(R(\mu)) \le FE_{\Lambda_n}(R(\mu')) + o(|\Lambda_n|) \le FE_{\Lambda_n}(\mu) + o(|\Lambda_n|),$$

and

$$SFE(\mu) = \lim_{n \to \infty} |\Lambda_n|^{-1} FE_{\Lambda_n}(\mu) \ge \lim_{n \to \infty} |\Lambda_n|^{-1} FE_{\Lambda_n}(R(\mu)) = SFE(R(\mu)).$$

n— «x) n^oo 

Finally, note that both T~ +h and h become infinite clusters after the swap; thus, it 
follows from Lemma 8.5.6 that $R(\mu)$ is not a gradient Gibbs measure. Finally, since the
averages $\frac{1}{2}[\phi_1(y) - \phi_1(x) + \phi_2(y) - \phi_2(x)]$ are always left unchanged by swapping maps, it is clear that
$S_a(\mu) = S_a(R(\mu))$; thus, Theorem 8.6.1 holds in this case.

8.6.3 Infinitely repeated infinite cluster swaps 

In this section we deal with the case that there exists no height difference variable $h$ for $\mu$.
Note that if either $B^+$ or $B^-$ were finite with positive probability, then this $B^+$ or $B^-$ would
itself be a height difference variable for the measure $\mu_0$ equal to $\mu$ conditioned on this event,
and we could apply the reflection of the previous section to the measure $\mu_0$.

We may thus assume that $B^+$ and $B^-$ are both almost surely not finite. Lemma 8.5.4
implies that we cannot have either $B^+ = -\infty$ or $B^- = \infty$ with positive probability. Thus,
$B^+ = \infty$ and $B^- = -\infty$ with positive probability, in which case there exist infinite clusters
$T^-_c$ for all values $c \in \mathbb{R}$. Recall that the sets $T^-_c$ are increasing in $c$. For any $x \in \mathbb{Z}^d$, let $F(x)$
be the smallest value $c$ for which $x \in T^-_c$.

Now, given a set $\Lambda_n$, we can define a "single cluster swapping" operation $R_c$ that is
measure preserving on the four-tuple $(\phi_1, \phi_2, r, F_{\partial\Lambda_n})$ the same way we did in the previous
section (except this time setting $h = 0$). We write $R_c(\phi_1, \phi_2, r, F_{\partial\Lambda_n}) = (\psi_1, \psi_2, s, F_{\partial\Lambda_n})$,
where $F_{\partial\Lambda_n}$ is left unchanged and $(\psi_1, \psi_2, s) = R_c(\phi_1, \phi_2, r)$ (as defined in the previous
section). This map is measure preserving on $\nu_1 \otimes \nu_3$ (as defined in the previous section).

Now, pick some positive value $M \in E$; we will consider maps of the form $R_{kM}$ for
$k \in \mathbb{Z}$. Clearly, if $kM > \sup_{x \in \partial\Lambda_n} F(x)$, then the map $R_{kM}$ leaves $(\phi_1, \phi_2, r, F_{\partial\Lambda_n})$ unchanged.
Similarly, if $kM < \inf_{x \in \partial\Lambda_n} F(x)$, then $R_{kM}$ (up to additive constants) simply permutes
$\phi_1 - \phi_1(x_0)$ and $\phi_2 - \phi_2(x_0)$ and makes no change to $r$. Now, write $R = \prod_{k=k_1}^{k_2} R_{kM}$ where
$k_1$ is any odd integer for which $k_1 M < \inf_{x \in \partial\Lambda_n} F(x)$ and $k_2$ is any integer for which $k_2 M >
\sup_{x \in \partial\Lambda_n} F(x)$. Note that up to additive constant, the map $R$ is independent of the particular
$k_1$ and $k_2$ we choose. (Because the maps $R_{kM}$ permute $\phi_1$ and $\phi_2$ when $k < k_1$, this would
not be true if we did not fix the parity of $k_1$.)

Taking limits of the $R$ thus defined on increasingly large boxes, we can extend $R$ to
a function from $\Omega$ to $\Omega$ (see also the explicit description of $R$ below). We now define the
measure $R(\mu)$ on $(\Omega, \mathcal{F}^\tau)$ as follows: to sample from $R(\mu)$, first choose $\phi_1$ and $\phi_2$ from $\mu$; pick
the additive constant of $\phi_1$ so that $\phi_1(x_0) = 0$, and choose the additive constant of $\phi_2$ so
that $\phi_2(x_0)$ is uniformly distributed in $[0, M)$. The proof that $SFE(R(\mu)) \le SFE(\mu)$ is now
essentially the same as the proof given in the single infinite cluster swap case. We first observe
that $FE_{\Lambda_n}(\mu') = FE_{\Lambda_n}(R(\mu'))$ where $\mu'$ and $R(\mu')$ are measures on five-tuples defined as in the
previous section; the only differences are first, that we may now assume $c \in [0, M)$, since the
map only depends on the value of $c$ modulo $M$, and second, that $F$ is defined differently.
However, it is still easy to show that the growth of $|\mu'\mathcal{H}(\mu'^{\phi_1,\phi_2,r,c_0}, \nu_3)|$ is $o(|\Lambda_n|)$ by using the
fact that $SFE(\mu)$ is finite to show that the discrete derivative of $F$ at every point in $\partial\Lambda_n$
has finite expectation.

It now remains only to show that $R(\mu)$ is not a gradient Gibbs measure. We begin by
giving a more explicit expression for the map $R$. We know that if $F(x) \le kM$, then $R_{kM}$
fixes the pair $(\phi_1(x), \phi_2(x))$ if $E = \mathbb{Z}$; if $F(x) < kM$, then $R_{kM}$ fixes the pair $(\phi_1(x), \phi_2(x))$
when $E = \mathbb{R}$. (When $E = \mathbb{R}$, the event that $F(x)$ is exactly equal to $kM$ will always have
measure zero, so we ignore this case.) On the other hand, if $F(x) > kM$, then $R_{kM}$ sends the
pair $(\phi_1(x), \phi_2(x))$ to the pair $(\phi_2(x) - kM, \phi_1(x) + kM)$. And then $R_{(k-1)M}$ sends that pair
to $(\phi_1(x) + M, \phi_2(x) - M)$. After successively applying $R_{(k-2)M}, \ldots, R_{-M}$, we thus end up
with the pair $(\phi_1(x) + \frac{k}{2}M, \phi_2(x) - \frac{k}{2}M)$ if $k$ is even and $(\phi_2(x) - \frac{k+1}{2}M, \phi_1(x) + \frac{k+1}{2}M)$ if $k$
is odd. Write $F_M(x) = \sup\{k : F(x) > kM\}$. Then we have $R(\phi_1, \phi_2, r) = (\psi_1, \psi_2, s)$,
where

$$\psi_1(x) = \begin{cases} \phi_1(x) + \frac{F_M(x)}{2} M & F_M(x) \text{ is even} \\ \phi_2(x) - \frac{F_M(x)+1}{2} M & F_M(x) \text{ is odd} \end{cases}$$

$$\psi_2(x) = \begin{cases} \phi_2(x) - \frac{F_M(x)}{2} M & F_M(x) \text{ is even} \\ \phi_1(x) + \frac{F_M(x)+1}{2} M & F_M(x) \text{ is odd} \end{cases}$$

We can take this as the formal definition of the repeated swapping map $R$ on all of $\Omega$.
As always, $s(e)$ is determined on every edge by the requirement that $R$ preserve the total
energy on each edge. This gives us an explicit definition of $R : \Omega \to \Omega$. We now need the
following:

Lemma 8.6.4 When $x$ and $y$ are neighboring vertices of $\mathbb{Z}^d$, the following are true:

1. If $F_M(y) = F_M(x) + 1$ and $F_M(x)$ is even, then $(\psi_1 + M, \psi_2, s)$ is swappable at $(x, y)$.

2. If $F_M(y) = F_M(x) + 1$ and $F_M(x)$ is odd, then $(\psi_1, \psi_2, s)$ is swappable at $(x, y)$.

3. If $F_M(y) \ge F_M(x) + 2$, then both $(\psi_1, \psi_2, s)$ and $(\psi_1 + M, \psi_2, s)$ are swappable at $(x, y)$.

Proof This can be proved directly from the formal definition of $R$ given above; however, it
is most intuitively understood by tweaking the above to give yet another explicit formulation
of $R$. Consider first a triplet $(\phi_1, \phi_2, r)$ with additive constants fixed. Now, it is clear (e.g.,
from the above description) that the average is unchanged by swapping, i.e.,
$a(x) = \frac{\psi_1(x) + \psi_2(x)}{2} = \frac{\phi_1(x) + \phi_2(x)}{2}$.
Thus, $\psi_1$ and $\psi_2$ are determined by the difference $\delta(x) = \psi_2(x) - \psi_1(x)$. We can write $\psi_2(x) =
a(x) + \delta(x)/2$ and $\psi_1(x) = a(x) - \delta(x)/2$, and thus the energy contained in $\psi_1$ and $\psi_2$ at
an edge $e = (x, y)$ is

$$W_e = V(a(y) - a(x) + [\delta(y) - \delta(x)]/2) + V(a(y) - a(x) - [\delta(y) - \delta(x)]/2).$$

Even if $V$ is not a symmetric function, this expression is symmetric in $\delta(y) - \delta(x)$. Denote
by $\chi(e)$ the maximum value of $[\delta(y) - \delta(x)]$ for which the above expression is less than or
equal to the combined energy contained in the triplet $(\phi_1, \phi_2, r)$ at the edge $e$. Whatever
swaps we perform, $|\delta(y) - \delta(x)|$ may not exceed $\chi(e)$.

Now, denote by $\gamma$ the function defined analogously to $\delta$ but using $\phi_1$ and $\phi_2$ instead of $\psi_1$
and $\psi_2$. Now, it is easy to check that the set $S_c$ of places where $\phi_1 + c$ and $\phi_2$ are swappable
is precisely the set of edges $e = (x, y)$ at which $|2c - \gamma(x) - \gamma(y)| \le \chi(e)$. If a swapping $R_c$ swaps the
values of $\phi_1 + c$ and $\phi_2$ at $y$ and fixes the values at $x$, then this has the effect of replacing $\gamma(y)$
with $2c - \gamma(y)$ and leaving $\gamma(x)$ unchanged. In other words, the act of "swapping" $\phi_1(y) + c$
and $\phi_2(y)$ becomes the act of reflecting $\gamma(y)$ across the horizontal axis of height $c$. Note
that whenever $\gamma(y)$ and $\gamma(x)$ lie on opposite sides of $c$ (or one of the values is equal to $c$)
then $e \in S_c$. If $\gamma(x)$ and $\gamma(y)$ are on the same side of $c$, then $e \in S_c$ if and only if a string of
length $\chi(e)$ can stretch from $\gamma(x)$ to $c$ and back to $\gamma(y)$: i.e., $|\gamma(x) - c| + |\gamma(y) - c| \le \chi(e)$.
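
The reflection picture in the preceding paragraph can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names `swap_at`, `gamma`, `swappable`, and `string_criterion` are hypothetical, heights are plain numbers at a single vertex or edge, and the string test assumes the unswapped edge already satisfies $|\gamma(x) - \gamma(y)| \le \chi(e)$.

```python
def swap_at(phi1, phi2, c):
    """Apply R_c at one vertex: exchange the values of phi1 + c and phi2."""
    return phi2 - c, phi1 + c


def gamma(phi1, phi2):
    """Height difference gamma = phi2 - phi1 at one vertex."""
    return phi2 - phi1


def swappable(gx, gy, c, chi):
    """Edge criterion from the text: |2c - gamma(x) - gamma(y)| <= chi(e)."""
    return abs(2 * c - gx - gy) <= chi


def string_criterion(gx, gy, c, chi):
    """String picture of the same criterion.

    Assumes the unswapped edge is admissible, i.e. |gx - gy| <= chi.
    """
    if (gx - c) * (gy - c) <= 0:
        # gamma(x) and gamma(y) lie on opposite sides of c (or touch c).
        return True
    # Same side: the string must reach from gamma(x) to c and back to gamma(y).
    return abs(gx - c) + abs(gy - c) <= chi


if __name__ == "__main__":
    p1, p2 = swap_at(3, 7, 5)
    # The swap reflects gamma across height c and preserves phi1 + phi2.
    print(gamma(p1, p2), 2 * 5 - gamma(3, 7), p1 + p2)
```

Applying `swap_at` at level $kM$ and then at level $(k-1)M$ sends $(\phi_1, \phi_2)$ to $(\phi_1 + M, \phi_2 - M)$, which is the two-step computation used earlier to derive the explicit form of $R$.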

Now, we can extend $\gamma$ in a unique way to a continuous function on each closed edge
$e = (x, y)$ so that $\gamma$ is linear on each of the two half segments of $e$, $\gamma$ achieves its maximum
at the midpoint $m$ of $e$, and $|\gamma(m) - \gamma(x)| + |\gamma(m) - \gamma(y)| = \chi(e)$. Now, if either $\gamma(x) < c$
or $\gamma(y) < c$, then we have $e \in S_c$ if and only if $\gamma$ assumes the value $c$ at some point along
the edge $e$.

If $F_M(x) < F_M(y)$, then, by definition, $x$ and $y$ are not in the same components of $\mathbb{E}^d \setminus S_c$
where $c \in \{F_M(x)M + M, F_M(x)M + 2M, \ldots, F_M(y)M\}$; in particular, this implies that
$(\phi_1 + c, \phi_2, r)$ is swappable at $e$ for $c = F_M(x)M + M$ and $c = F_M(y)M$. Since $\gamma(x) < F_M(x)M + M$,
this implies that $\gamma$ assumes all of the values $F_M(x)M + M, F_M(x)M + 2M, \ldots, F_M(y)M$ at
some point along the edge $e$. Now, the map $R_c$ can be extended to the continuous version of
$\gamma$ as follows: first extend $T^-_c$ to continuous points by letting it contain not only the points
in $\mathbb{Z}^d$ defined to be in $T^-_c$ before but also those points $z$ on the interior of an edge for which
$\gamma(z) < c$ and there is a path from $z$ to a point in $T^-_c$ along which $\gamma < c$ (or equivalently,
all points $z$ starting from which there exists an infinite-length, non-self-intersecting path
along which $\gamma < c$). As before, we write $F(x)$ for the smallest value $c$ for which $x \in T^-_c$,
and $F_M(x) = \sup\{k : F(x) > kM\}$. Then define $R_c(\gamma) = \delta$ where $\delta(z) = \gamma(z)$ if $z \in T^-_c$ and
$\delta(z) = 2c - \gamma(z)$ otherwise. We can similarly extend the definition of $R$ to the interiors of
the edges by writing $R(\gamma) = \delta$ where

$$\delta(x) = \begin{cases} \gamma(x) - F_M(x)M & F_M(x) \text{ is even} \\ (F_M(x) + 1)M - \gamma(x) & F_M(x) \text{ is odd.} \end{cases}$$

It is clear that the total variation of $\delta$ is equal to that of $\gamma$ along each edge $e$. If
$F_M(x) < F_M(y)$, and $z_{F_M(x)+1}, \ldots, z_{F_M(y)}$ are the points along the edge from $x$ to $y$ at which
$\gamma$ first assumes the values $F_M(x)M + M, F_M(x)M + 2M, \ldots, F_M(y)M$, then it is not hard to
see that $\delta(z_i)$ will be $0$ if $i$ is even and $M$ if $i$ is odd (since $F_M(z) = F(z) = \gamma(z)$ at these
points). From this, the lemma follows immediately. ∎
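
The folding formula for $R(\gamma)$ used in the proof of Lemma 8.6.4 can be illustrated with a short sketch. This is only an illustration under a stated assumption: it evaluates the formula at points where $F(z) = \gamma(z)$, so that $F_M(z)M < \gamma(z) \le (F_M(z)+1)M$. The helper names `level_index` and `fold` are hypothetical.

```python
import math
from fractions import Fraction


def level_index(F, M):
    """F_M = sup{k in Z : F > k*M}, i.e. the largest integer k with k*M < F."""
    return math.ceil(Fraction(F) / Fraction(M)) - 1


def fold(gamma, F_M, M):
    """delta = R(gamma) at one point, by the parity rule stated in the proof."""
    if F_M % 2 == 0:
        return gamma - F_M * M
    return (F_M + 1) * M - gamma


if __name__ == "__main__":
    M = 2
    # Assumption for this illustration: F = gamma at the sampled points.
    for i in range(1, 7):
        g = i * M
        print(i, fold(g, level_index(g, M), M))
```

Under the stated assumption the folded value always lies in $[0, M]$, and at the crossing levels $\gamma = iM$ it alternates between $M$ and $0$ with the parity of $i$.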
Now, if we define $S_0$ and $S_M$ using the triple $(\psi_1, \psi_2, s)$, what are the infinite clusters
of $\mathbb{E}^d \setminus S_0$? If $k$ is odd, then the above result implies that each of the edges separating an
element of $T^-_{kM}$ (defined using the original $\phi_1, \phi_2, r$) from an element of its complement
is in $S_0$; thus, for no odd $k$ does an infinite cluster contain both a member of $T^-_{kM}$ and a
member of its complement. Since the $T^-_{kM}$ are nested sets, it follows that any such infinite cluster
must be contained in $T^-_{(k+2)M} \setminus T^-_{kM}$ for some odd $k$. Now, if there is one such
cluster, there must almost surely be infinitely many, since otherwise the minimum value of
$k$ for which such a cluster occurs in $(\psi_1, \psi_2, s)$ would be a height difference variable for $\mu$
(and in this subsection, we are assuming that no height difference variable exists). If there
are infinitely many infinite clusters with positive probability, then Lemma 8.5.6 implies that
$R(\mu)$ is not a gradient Gibbs measure. A similar argument holds for the infinite clusters of
$\mathbb{E}^d \setminus S_M$. However, if there are no infinite clusters in the complement of either $S_0$ or $S_M$, then
Lemma 8.5.4 implies that $R(\mu)$ is not a gradient Gibbs measure. Finally, as in the previous
section, since the averages $\frac{1}{2}[\phi_1(y) - \phi_1(x) + \phi_2(y) - \phi_2(x)]$ are always left unchanged by swapping
maps, we have $S_a(\mu) = S_a(R(\mu))$ and the statement of Theorem 8.6.1 follows.



8.7 Height offset spectra 

From Theorem 8.6.3 we know that if $E = \mathbb{R}$, then for $u \in U_\Phi$ and simply attractive $\Phi$,
there exists a unique minimal gradient phase $\mu_u$ on $(\Omega, \mathcal{F}^\tau)$. In the case $E = \mathbb{Z}$, the theorem
implies that if there exists a rough measure $\mu_u$ of slope $u$, it is also unique. In fact, Theorem
8.6.1 also implies the following:

Lemma 8.7.1 If $u \in U_\Phi$ and either $E = \mathbb{R}$ or some minimal gradient phase $\mu_u$ of slope $u$
is rough, then there is a unique minimal gradient phase $\mu_u$ of slope $u$ and it is extremal.

Proof Theorem 8.6.1 already gives uniqueness of $\mu_u$. If $\mu_u$ fails to be extremal, then Lemma
8.5.3 implies that with $\mu = \mu_u \otimes \mu_u \otimes \pi$ positive probability, we have the strict inequality
$B^+ > B^-$. If $E = \mathbb{R}$ or if $E = \mathbb{Z}$ and $B^+ > B^- + 1$ with $\mu$-positive probability, then Theorem
8.6.1 implies a contradiction (through the same argument as in the proof of Theorem 8.6.3).
Suppose on the other hand that $E = \mathbb{Z}$ and $B^+ - B^- \in \{0, 1\}$ almost surely. (Recall from
Lemma 8.5.3 that $B^+ \ge B^-$ almost surely.) Then $B^+$ is a height difference variable for $\mu$ and
hence $\mu_u$ is smooth (by Lemma 8.4.4). ∎

This section is devoted to the exceptional case that $E = \mathbb{Z}$, $u \in U_\Phi$, and every minimal
gradient phase of slope $u$ is smooth. In this case, $\mu = \mu_u$ may not be extremal, and we will
determine its extremal components. Suppose that $\mu'$ is an extremal component chosen from
$w_\mu$. Then since $\mu'$ is ($w_\mu$-almost surely) smooth, we can view $\mu'$ as a measure on $(\Omega, \mathcal{F})$ (and we
may choose the additive constant arbitrarily). Since the additive constant is an integer, the
average expected value of $\mu'$ over any $\Lambda \subset\subset \mathbb{Z}^d$, taken modulo 1, is independent of the
additive constant.

One way to extend $\mu$ to a Gibbs measure on $(\Omega, \mathcal{F})$ is as follows: to sample from $\mu$, first
sample an extremal component $\mu'$ from $w_\mu$; then treat $\mu'$ as a measure on $(\Omega, \mathcal{F})$, adding an
appropriate integer constant to cause the $\mu'$ expected height of $\phi(0)$ to lie in $[0, 1)$. It is then
clear that $\mu(\phi(0)) \in [0, 1)$; moreover, by the definition of slope, $\mu(\phi(x)) \in [(u, x), (u, x) + 1)$ for
each $x \in \mathcal{L}$.

Lemma 8.7.2 If $\mu$ is a minimal gradient phase with slope $u \in U_\Phi$, extended as above to a
measure on $(\Omega, \mathcal{F})$, then for $w_\mu \otimes w_\mu$-almost all pairs of extremal Gibbs measures $(\mu'_1, \mu'_2)$,
we have either $\mu'_1 \prec \mu'_2 \prec \mu'_1 + 1$ or $\mu'_2 \prec \mu'_1 \prec \mu'_2 + 1$.

Proof From Theorem 8.6.1 we have that $\mu \otimes \mu \otimes \pi$ almost surely $B^+ - B^- \in \{0, 1\}$. From
Lemma 8.5.3 (and Lemma 3.2.3) we have that $w_\mu \otimes w_\mu$-almost surely, $\mu'_1 \prec \mu'_2 + c \prec \mu'_1 + 1$
for some value of $c \in \mathbb{Z}$. But since the expected value of $\phi(0)$ is in $[0, 1)$ for $w_\mu$-almost all
measures, we may assume that either $c = 0$ or $c = 1$. In the former case, $\mu'_1 \prec \mu'_2 \prec \mu'_1 + 1$.
In the latter case, $\mu'_2 \prec \mu'_1 \prec \mu'_2 + 1$. ∎

Theorem 8.7.3 Let $\mu$ be a smooth minimal phase of slope $u \in U_\Phi$. The following is a height
offset variable, as defined in Section 8.4: $h(\phi) = \liminf_{n \to \infty} |\Lambda_n|^{-1} \sum_{x \in \Lambda_n} \phi(x)$.

Proof Lemma 8.7.2 implies that for each $x \in \mathbb{Z}^d$, the distribution of the random variable
$\mu'(\phi(x))$ is supported in an interval of length one. In particular, for a point $x \in \mathcal{L}$, since
$\mu(\phi(x)) \in [(u, x), (u, x) + 1)$, we may conclude that almost surely (for $\mu'$ chosen from $w_\mu$),
$|\mu'(\phi(x)) - (u, x)| \le 2$.

Now, taking $\Lambda_n$ to be the box of side length $2n + 1$ centered at the origin, it follows that
almost surely,

$$-2 \le h(\mu') = \liminf_{n \to \infty} |\Lambda_n|^{-1} \sum_{x \in \Lambda_n} \mu'(\phi(x)) \le 2.$$

If we write $h(\phi) = h(\pi^\phi)$ (as in Lemma 3.2.3), then it is not hard to see that $h$ is a height
offset variable (as defined in Section 8.4).

We now claim that this $h$ is in fact equivalent to the $h$ given in the statement of the
theorem. To see this, first write $\phi_h(x) = \phi(x) - h(\phi) - (u, x)$; note that $\phi_h$ is an $\mathcal{F}^\tau$-measurable
function. Now, by Lemma 8.2.5, $w_\mu$-almost surely, the $\mu'$ distribution of $\phi(0)$
is log concave; by similar arguments to those above, we also have $w_\mu$-almost surely that
$\nu' \prec \nu \prec \nu' + 1$ where $\nu$ and $\nu'$ are the laws of $\phi(0)$ under $\mu$ and $\mu'$ respectively. It follows
that the tails of $\nu$ decay exponentially; in particular, $\mu$ has a finite expectation at every
point in $\mathbb{Z}^d$, and $\mu(\phi_h(x))$ exists for all $x \in \mathbb{Z}^d$ and is finite. By the
ergodic theorem, we have $\liminf |\Lambda_n|^{-1} \sum_{x \in \Lambda_n} \phi_h(x) = \lim_{n \to \infty} |\Lambda_n|^{-1} \sum_{x \in \Lambda_n} \phi_h(x) = 0$, and
the desired equivalence follows. ∎

Applying Lemma 8.4.2 gives the following:

Corollary 8.7.4 The height offset spectrum $h(\mu)$ is a measure on $[0, 1)$ which is ergodic
with respect to the maps $x \mapsto x + u_i$ (mod 1) for $1 \le i \le d$. In particular, if one of the
components of $u$ is irrational, then $h(\mu)$ is uniformly distributed. Also, if $\mu$ is extremal (so
that $h(\mu)$ is a point measure), then $u$ must lie in the dual lattice of $\mathcal{L}$.

Theorem 8.7.5 If $\mu_1$ and $\mu_2$ are minimal gradient phases with the same slope $u \in U_\Phi$ and
the same height offset spectrum $\nu \in \mathcal{P}([0, 1))$, then $\mu_1 = \mu_2$.

Proof For any $\epsilon > 0$, we take $\mu_\epsilon$ to be $\mu_1 \times \mu_2$ conditioned on the event $A_\epsilon$ that the distance
between $h(\phi_1)$ and $h(\phi_2)$ on $[0, 1)$ (viewed as a circle) is at most $\epsilon$; note that this event
occurs with positive probability. Since $\mu_\epsilon$ and $\mu_1 \times \mu_2$ conditioned on the complement of
$A_\epsilon$ are both $\mathcal{L}$-invariant gradient Gibbs measures, and since $SFE(\mu) = 2\sigma(u)$ is minimal
(writing $\mu = \mu_1 \times \mu_2$), it follows from the affine property of $SFE$ that $SFE(\mu_\epsilon) = 2\sigma(u)$. Note that the marginals of
$\mu_\epsilon$ are $\mu_1$ and $\mu_2$. Letting $\epsilon$ tend to zero, by Theorem 2.4.2 there is a limit point $\mu_0$ with
$SFE(\mu_0) = SFE(\mu)$ and at which $h(\phi_1) = h(\phi_2)$, $\mu_0$-almost surely.

As in the proof of Lemma 8.7.3 we note that $\mu_0$-almost surely (for appropriate choice of
additive constants), $\pi^{\phi_1} \prec \pi^{\phi_2} \prec \pi^{\phi_1} + 1$. In particular, this implies $h(\phi_1) \le h(\phi_2) \le h(\phi_1) +
1$; since $h(\phi_1)$ and $h(\phi_2)$ agree, $\mu_0$-almost surely, modulo one, we have either $h(\phi_1) = h(\phi_2)$
or $h(\phi_2) = h(\phi_1) + 1$. Assume without loss of generality that the former is the case. Since
$\pi^{\phi_1} \prec \pi^{\phi_2}$, an application of the ergodic theorem implies that $\pi^{\phi_1} = \pi^{\phi_2}$ (otherwise,
$h(\phi_1) \ne h(\phi_2)$ modulo one). ∎

Theorem 8.7.5 and Corollary 8.7.4 imply the following:

Corollary 8.7.6 If $u \in U_\Phi$ and one of the components of $u$ is irrational, then the minimal
gradient phase $\mu_u$ of slope $u$ is unique.

We also have:

Corollary 8.7.7 If $u \in U_\Phi$ and all of the components of $u$ are rational, then the smallest
positive rational number obtained as $(u, x)$, for $x \in \mathcal{L}$, has the form $1/n$ for some $n \in \mathbb{Z}$, and
each minimal gradient phase $\mu_u$ of slope $u$ has height offset spectrum given by the uniform
measure on $\{c, c + 1/n, c + 2/n, \ldots, c + (n - 1)/n\}$ for some $c \in [0, 1/n)$.

Proof The maps $g_x$, giving translation of $[0, 1)$ by $(u, x)$ modulo 1, are elements of the group
of all rotations of $[0, 1)$. The map $x \mapsto g_x$ is a homomorphism from the additive group $\mathcal{L}$ into
this abelian group. Since $u$ is rational, its image is a finite subgroup of the set of rotations;
letting $n$ be the order of this group, the result follows from Corollary 8.7.4 and the fact
that every measure on $[0, 1)$ which is ergodic under translations by this group is given by a
uniform measure on $\{c, c + 1/n, c + 2/n, \ldots, c + (n - 1)/n\}$ for some $c \in [0, 1/n)$. ∎
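
The order $n$ of the rotation group in this proof can be computed directly for a concrete rational slope. The sketch below is an illustration only and rests on two assumptions not stated in the text: the sublattice $\mathcal{L}$ is given by a list of integer basis vectors, and the subgroup of rotations generated by fractions $a_j/q_j$ (in lowest terms) has order equal to the least common multiple of the $q_j$. The function names are hypothetical.

```python
from fractions import Fraction
from math import gcd


def inner(u, x):
    """Inner product (u, x) of a rational slope with an integer lattice vector."""
    return sum(ui * xi for ui, xi in zip(u, x))


def rotation_group_order(u, basis):
    """Order n of the group generated by the rotations (u, b) mod 1, b in basis."""
    n = 1
    for b in basis:
        step = inner(u, b) % 1          # rotation angle in [0, 1)
        q = step.denominator            # order of this single rotation
        n = n * q // gcd(n, q)          # least common multiple
    return n


def spectrum_support(n, c):
    """Support {c, c + 1/n, ..., c + (n-1)/n} of the uniform spectrum."""
    return [c + Fraction(k, n) for k in range(n)]


if __name__ == "__main__":
    u = (Fraction(1, 2), Fraction(1, 3))
    print(rotation_group_order(u, [(1, 0), (0, 1)]))   # L = Z^2
    print(rotation_group_order(u, [(2, 0), (0, 2)]))   # L = 2Z^2
```

Note that the same slope can give different values of $n$ for different sublattices $\mathcal{L}$, since only the inner products $(u, x)$ with $x \in \mathcal{L}$ enter.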

Finally, we would like to describe precisely the way in which $\mu_u$ decomposes into extremal
measures, with one extremal measure for each "height offset" value modulo 1. We do this first
for the irrational case. In what follows, we say a function $f : \Omega \to \mathbb{R}$ is increasing
if for each $\phi_1, \phi_2 \in \Omega$ with $\phi_1 \le \phi_2$, we have $f(\phi_1) \le f(\phi_2)$. We say $f$ is decreasing if $-f$ is
increasing. We say an event $A \in \mathcal{F}$ is increasing (decreasing) if $1_A$ is increasing (decreasing).

Theorem 8.7.8 Suppose $u \in U_\Phi$, one of the components of $u$ is irrational, and $\mu_u$ is the
unique smooth minimal gradient phase of slope $u$. Then there exists a unique family $\mu_{u,a}$ of
extremal Gibbs measures (one for each $a \in \mathbb{R}$) on $(\Omega, \mathcal{F})$ with all of the following properties:

1. Height offset property: For each $a \in \mathbb{R}$, we have that for $\mu_{u,a}$ almost all $\phi$, $h(\phi) = a$.

2. Stochastic domination property: $\mu_{u,a_1}$ stochastically dominates $\mu_{u,a_2}$ whenever
$a_1 > a_2$.

3. Vertical translational symmetry: For each $b \in \mathbb{Z}$, we have $b + \mu_{u,a} = \mu_{u,a+b}$.

4. Decomposition property: The restriction of $\int_c^{c+1} \mu_{u,a}\,da$ to $\mathcal{F}^\tau$ is equal to $\mu_u$ for
every $c \in \mathbb{R}$.

5. Right-continuity: For each increasing event $A \in \mathcal{F}$, $\mu_{u,a}(A)$ is increasing and right-
continuous in $a$.

6. Extremality: For all $a \in \mathbb{R}$, $\mu_{u,a}$ is extremal.

7. Horizontal translational symmetry: For each $x \in \mathcal{L}$ and $a \in \mathbb{R}$, we have $\theta_x \mu_{u,a} = \mu_{u,a+(u,x)}$.

Proof First, we will prove uniqueness by showing that there is at most one definition of the
$\mu_{u,a}$ which satisfies all of the above properties. Fix a $c \in \mathbb{R}$ and extend $\mu_u$ to $(\Omega, \mathcal{F})$ in such
a way that $h$ is $\mu_u$ almost surely in $[c, c + 1)$. By the decomposition property and the height
offset property, we can write $\mu_u = \int_c^{c+1} \mu_{u,a}\,da$.

Now, for any $c_1, c_2$ with $c \le c_1 < c_2 \le c + 1$, we write $\mu_{u,(c_1,c_2)}$ for the measure $\mu_u$
conditioned on the positive-probability event $h \in (c_1, c_2)$; since height offset modulo one is
$\mathcal{F}^\tau$-measurable, the decomposition and height offset properties also imply that $\mu_{u,(c_1,c_2)} =
(c_2 - c_1)^{-1} \int_{c_1}^{c_2} \mu_{u,a}\,da$. The right continuity property implies that for each increasing $A$ and $a \in [c, c + 1)$, we
have $\mu_{u,a}(A) = \lim_{b \to 0} \mu_{u,(a,a+b)}(A)$ (where $b \to 0$ from the right). Since the increasing events
generate the $\sigma$-algebra $\mathcal{F}$, any two measures which agree on increasing events must agree
on all measurable events in $\mathcal{F}$; thus, if there exists a measure $\mu_{u,a}$ for which $\mu_{u,a}(A) =
\lim_{b \to 0} \mu_{u,(a,a+b)}(A)$ for all increasing $A$, that measure is unique.

But we still have to prove that such a measure in fact exists. First, we claim that this
limit exists for every increasing set $A$ and for every $a \in [c, c + 1)$. To see this, first observe
that $\mu_{u,(a_1,a_2)} \prec \mu_{u,(a_3,a_4)}$ whenever $0 \le a_1 < a_2 \le a_3 < a_4 \le 1$. To show this, it is enough
to note from Lemma 8.5.3 and Theorem 8.6.1 that $\mu_{u,(a_1,a_2)} \otimes \mu_{u,(a_3,a_4)} \otimes \pi$ almost surely,
$B^+ - B^- \le 1$. This implies that for extremal measures $(\nu_1, \nu_2)$ chosen from the extremal
decompositions of these measures, we almost surely have $\nu_1 \prec \nu_2 + \alpha \prec \nu_1 + 1$ for some
value of $\alpha \in \mathbb{Z}$; but since the height almost surely satisfies $h(\nu_1) < h(\nu_2) < h(\nu_1) + 1$, we
may conclude that $\alpha = 0$. In fact, we can also note that $\mu_{u,(a_1,a_2)} \prec \mu_{u,(a_1,a_3)}$ whenever
$0 \le a_1 < a_2 < a_3 \le 1$; this follows from the fact that $\mu_{u,(a_1,a_3)}$ is a weighted average of
$\mu_{u,(a_1,a_2)}$ and a measure, namely $\mu_{u,(a_2,a_3)}$, which dominates $\mu_{u,(a_1,a_2)}$.

This implies that for every increasing event $A \in \mathcal{F}$, $\mu_{u,(a,a+b)}(A)$ is a monotone function
of $b$, and hence has a limit as $b$ tends to zero. We would like to extend this convergence to all
measurable sets $A$. To this end, first, the reader may easily check that the set of measures $\mu$
for which $\mu_u - 1 \prec \mu \prec \mu_u + 1$ is sequentially compact in the topology of local convergence.
Thus, for some subsequence of values of $b$ tending to zero, the limit $\lim_{b \to 0} \mu_{u,(a,a+b)}$ exists
as a measure on $(\Omega, \mathcal{F})$, which we denote by $\mu_{u,a}$. Since the value of $\mu_{u,a}$ on increasing sets
is independent of the subsequence (when the subsequence is chosen so that $\mu_{u,a}$ is in fact a
measure), the value of $\mu_{u,a}$ on all sets is independent of the subsequence (since increasing
sets generate $\mathcal{F}$).

We can take the above limit as a definition for $\mu_{u,a}$ for each $a \in [c, c + 1)$; the extension to
all $a$ is determined by the vertical translation property: for $\alpha \in \mathbb{Z}$, we have $\mu_{u,a+\alpha} = \mu_{u,a} + \alpha$.
Now we must verify the list of properties given above. It is not hard to see that the above
definition is independent of the choice of $c$; the decomposition property follows immediately.
Next, observe that $\mu_{u,(a-b,a)} \prec \mu_{u,a} \prec \mu_{u,(a,a+b)}$ for all $a$ and $b > 0$; letting $b$ tend to zero,
the fact that $h = a$ for $\mu_{u,a}$ almost all $\phi$ follows from the definition of $h$. The stochastic
domination property follows from the fact that $\mu_{u,a} \prec \mu_{u,(a,b)} \prec \mu_{u,b}$ whenever $a < b$.

Now, from the stochastic domination property, it is clear that for every increasing cylinder
set $A$, $\mu_{u,a}(A)$ is increasing in $a$. The decomposition property and height offset property
imply that $\mu_{u,(a,a+b)}(A)$ is the average of $\mu_{u,\alpha}(A)$ over $\alpha \in (a, a + b)$; since $\mu_{u,a}(A)$ is the limit
of these values as $b$ tends to zero, it follows that $\mu_{u,a}(A)$ is right-continuous in $a$.

We still need to verify extremality and the horizontal translational symmetry. We will
first check that these properties hold for Lebesgue almost all $a$ and then use continuity
arguments to extend them to all $a$.

To see almost-sure extremality, let $\mu^b$ be the measure on triplets obtained as follows. To
sample from $\mu^b$, first choose $a$ uniformly in $[0, 1)$ and then sample $\phi_1$ and $\phi_2$ independently
from $\mu_{u,(a,a+b)}$. Let $\mu$ be the limit of these measures as $b$ tends to zero. As in the proof of
Theorem 8.7.5 it is not hard to see that this limit has minimal specific free energy and that
$B^+ = B^-$ almost surely. Using the limit definition of $\mu_{u,a}$, it is also not hard to see that the
following is an equivalent definition of $\mu$: to sample from $\mu$, first choose $a$ uniformly from
$[0, 1)$ and then sample $(\phi_1, \phi_2, r)$ from $\mu_{u,a} \otimes \mu_{u,a} \otimes \pi$.

Next, if the $\mu_{u,a}$ were not extremal for almost all $a$, then for some $\epsilon, \delta > 0$, there would
be an $\epsilon$ fraction of $a$ values in $[0, 1)$ for which the probability that two extremal measures
independently sampled from the extremal decomposition of $\mu_{u,a}$ are different is at least $\delta$. But in this case, by Lemma
8.5.3 we would have to have $B^+ \ne B^-$ with probability at least $\delta\epsilon$, a contradiction. The
horizontal translation symmetry argument is the same, except that in this case, to sample
from $\mu$, we first choose $a$ uniformly in $[0, 1)$, then choose $\phi_1$ from $\mu_{u,a}$ and $\phi_2$ from the
measure $\theta_x \mu_{u,a-(u,x)}$ (which has the same height almost surely as $\mu_{u,a}$ by the definition of
height offset variables).

Now, suppose that $\mu_{u,a}$ is not extremal; then it can be written as $p\nu_1 + (1 - p)\nu_2$ for
some $0 < p < 1$ and Gibbs measures $\nu_1$ and $\nu_2$ which differ on at least one increasing
event: without loss of generality, say $A$ is increasing in $\mathcal{F}$ and $\nu_1(A) < \nu_2(A)$. Now, write
$f(\phi) = \pi^\phi(A)$; we can think of $f(\phi)$ as describing the probability of $A$ in the extremal
measure from which $\phi$ was chosen. For a decreasing sequence of values $a_i$ whose limit is $a$
(chosen from the almost-all values at which extremality is already known),
we have $\mu_{u,a_i}$ extremal, which implies that $\pi^\phi(A)$ is $\mu_{u,a_i}$-almost surely constant for each $i$.
Thus, $\pi^\phi(A)$ is $\mu_{u,a_i}$ almost surely equal to $\mu_{u,a_i}(A)$ for each $i$. By right continuity, we know
that $\mu_{u,a}(A) = \lim_{i \to \infty} \mu_{u,a_i}(A)$. And this is in turn equal to $\mu_{u,a}(f)$. Since $\mu_{u,a} \prec \mu_{u,a_i}$ for each
$i$, the law of $f(\phi)$ when $\phi$ is chosen from $\mu_{u,a_i}$ dominates the law of $f(\phi)$ when $\phi$ is chosen
from $\mu_{u,a}$. This implies that $f(\phi)$ (when $\phi$ is sampled from $\mu_{u,a}$) is dominated by a sequence
of constant random variables whose values converge to $\mu_{u,a}(f)$; this implies that, for $\mu_{u,a}$-
almost all $\phi$, we have $f(\phi) \le \mu_{u,a}(f)$, which implies that $f$ is $\mu_{u,a}$-almost surely constant, a
contradiction.

The horizontal translational symmetry argument is simpler; we observe from the
almost-sure invariance that whenever $b < c < b + 1$, we have



Taking limits as c approaches b from above gives the result. I 

The rational case of Theorem 8.7.8 is straightforward and the proof is similar:

Theorem 8.7.9 Suppose $u \in U_\Phi$, all of the components of $u$ are rational, and $\mu_u$ is a
smooth minimal gradient phase of slope $u$ with height offset spectrum given by uniform measure
on $\{c, c + 1/n, \ldots, c + (n - 1)/n\}$. Then there exists a unique family $\mu_{u,a}$ of extremal Gibbs
measures on $(\Omega, \mathcal{F})$ (one for each $a \in c + \frac{1}{n}\mathbb{Z}$) with all of the following properties:

1. Height offset property: For each $a \in c + \frac{1}{n}\mathbb{Z}$, we have that for $\mu_{u,a}$ almost all $\phi$,
$h(\phi) = a$.

2. Stochastic domination property: $\mu_{u,a_1}$ stochastically dominates $\mu_{u,a_2}$ whenever
$a_1 > a_2$.

3. Vertical translational symmetry: For each $b \in \mathbb{Z}$, we have $b + \mu_{u,a} = \mu_{u,a+b}$.

4. Decomposition property: The restriction of $\frac{1}{n}\sum_{i=0}^{n-1} \mu_{u,a+i/n}$ to $\mathcal{F}^\tau$ is equal to $\mu_u$ for
every $a \in c + \frac{1}{n}\mathbb{Z}$.

5. Extremality: For all $a \in c + \frac{1}{n}\mathbb{Z}$, $\mu_{u,a}$ is extremal.

6. Horizontal translational symmetry: For each $x \in \mathcal{L}$ and $a \in \mathbb{R}$, we have $\theta_x \mu_{u,a} = \mu_{u,a+(u,x)}$.

8.8 Example of slope-u gradient phase multiplicity

Here, we give an example of an LSAP $\Phi$ and a slope $u \in U_\Phi$ for which the gradient phase of
slope $u$ is not unique (and a sketch of the proof that it is not unique). For any $x \in \mathbb{Z}^3$, write

$$e(x) = \begin{cases} 0 & \text{Either zero or one component of } x \text{ is odd.} \\ 1 & \text{Either two or three components of } x \text{ are odd.} \end{cases}$$
Now, consider the LSAP $\Phi$ defined as follows. When $e(x) = e(y)$, we have:

$$\Phi_{x,y}(\eta) = \begin{cases} 0 & \eta = 0 \\ C & |\eta| = 1 \\ \infty & \text{otherwise} \end{cases}$$

When $e(x) \ne e(y)$, write

$$\Phi_{x,y}(\eta) = \begin{cases} 0 & \eta \in \{0, e(y) - e(x)\} \\ \infty & \text{otherwise} \end{cases}$$

In order to observe the symmetries of this potential better, we replace $\phi(x)$ with $\phi(x) - \frac{e(x)}{2}$.
Now, $\phi(x)$ assumes values not in $\mathbb{Z}$, necessarily, but in $\mathbb{Z} + \frac{e(x)}{2}$; modify $\Phi$ accordingly. In the
modified system, when $e(x) = e(y)$, then $\Phi_{x,y}(\phi)$ is $0$ if $\phi(x) = \phi(y)$, $C$ if $|\phi(x) - \phi(y)| = 1$,
and infinity otherwise. But when $e(x) \ne e(y)$, we have $\Phi_{x,y}(\phi) = 0$ if $|\phi(x) - \phi(y)| = \frac{1}{2}$ and
$\infty$ otherwise. The reader may check that $U_\Phi$ is the interior of a symmetric polyhedron with
the zero slope in its interior. Note that here, $\Phi$ restricted to the set of $x$ for which $e(x) = 0$
(respectively, $e(x) = 1$) is the Ising potential on that set; the only difference is that $\phi(x)$ is
allowed to assume values in $\mathbb{Z}$ (respectively, $\frac{1}{2} + \mathbb{Z}$), instead of merely values in $\{0, 1\}$, as in
the Ising model.

Now, take $\mathcal{L} = 2\mathbb{Z}^3$. This potential has two $\mathcal{L}$-invariant ground states (as defined in
Definition 6.18 of the reference cited below): we have $\phi^+(x) = \frac{e(x)}{2}$ for all $x \in \mathbb{Z}^d$, and
$\phi^-(x) = -\frac{e(x)}{2}$ for all $x \in \mathbb{Z}^d$. (Each one is defined only up to an additive constant.) A standard argument due
to Peierls (see, e.g. Theorem 6.9, Theorem 18.25 of j3Hl> and the surrounding discussion) 
implies that for C sufficiently large, there will be a slope zero minimal gradient phase which 
is a small perturbation of each of these ground states; in particular, there is more than one 
slope zero minimal gradient phase. 
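
The two ground states can be checked by brute force on a small box. The following is a minimal sketch of the modified three-dimensional potential described above, not a definitive implementation: the function names, the box size, and the value of $C$ are illustrative assumptions, and energies are summed only over nearest-neighbor edges inside the box.

```python
from fractions import Fraction
from itertools import product

INF = float("inf")


def e(x):
    """0 if zero or one coordinate of x is odd, 1 if two or three are odd."""
    odd = sum(coord % 2 for coord in x)
    return 0 if odd <= 1 else 1


def pair_energy(x, y, hx, hy, C):
    """Modified potential on the nearest-neighbor edge (x, y) with heights hx, hy."""
    diff = abs(hy - hx)
    if e(x) == e(y):
        if diff == 0:
            return 0
        return C if diff == 1 else INF
    return 0 if diff == Fraction(1, 2) else INF


def ground_state(sign):
    """phi^+ (sign = +1) or phi^- (sign = -1): x -> sign * e(x) / 2."""
    return lambda x: sign * Fraction(e(x), 2)


def box_energy(phi, n, C):
    """Total energy of phi over nearest-neighbor edges inside {0,...,n-1}^3."""
    total = 0
    for x in product(range(n), repeat=3):
        for axis in range(3):
            if x[axis] + 1 < n:
                y = x[:axis] + (x[axis] + 1,) + x[axis + 1:]
                total += pair_energy(x, y, phi(x), phi(y), C)
    return total


if __name__ == "__main__":
    print(box_energy(ground_state(+1), 5, 10), box_energy(ground_state(-1), 5, 10))
```

Raising the height of a single interior vertex with $e(x) = 0$ by one unit leaves the configuration admissible and costs $C$ for each neighbor with the same value of $e$, which is the kind of local excitation a Peierls argument counts.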

The rough essence of these arguments is as follows: first, $SFE(\mu) = 0$ when $\mu$ is one of
the two ground states. Thus, any $\mu$ for which $SFE(\mu)$ is minimal must satisfy $SFE(\mu) \le 0$.
Recall that we can represent $SFE(\mu)$ as minus the entropy of $\mu$ (i.e., $SFE^{\Phi_0}(\mu)$ where $\Phi_0$ is
the potential which is identically zero) plus the expected "energy per site" of $\mu$, which we will
write as $\mu(\Phi)$. The entropy is clearly at most $\log 2$ (since the number of finite-energy
configurations on an $n \times n$ box is at most $2^n$), which implies that if $\mu$ is minimal, then we
must have $\mu(\Phi) \le \log 2$. This implies that the fraction of vertices whose heights differ from
those of a neighbor by 1 is bounded above by a quantity of order $(\log 2)/C$. Define a ground state cluster to be
a maximal connected set of vertices on which $\phi$ is equal (up to an additive constant) to one
of the two ground states. Next, one samples a configuration $\phi$ on an $n \times n$ torus and uses
entropy considerations to argue that the probability that there exists a cluster of size larger
than $c$ containing the origin tends to zero exponentially in $c$. Thus, a typical configuration
contains one large ground state cluster with small "islands" spread throughout. Taking a
weak limit as $n$ tends to infinity, one obtains a smooth shift-invariant measure which is
equal to one of the ground states on an infinite ground state cluster with small finite
islands. By symmetry, there exists such a measure for each of the two ground states. Again,
the reader should consult 03] for full details. (The proof of Theorem 6.9 in 03] is easier to 
read than the more general proof in Chapter 18 and contains all of the ideas needed for the 
above example.) 

Similar constructions to these give gradient phase multiplicity in dimensions higher
than three. In fact, if we relax the condition that $\Phi$ be a nearest neighbor potential (allowing
$\Phi_{x,y}$ to be nonzero whenever either $|x - y| = 1$, so that $x$ and $y$ are adjacent, or $x - y \in \{(\pm 1, \pm 1)\}$,
so that $x$ and $y$ are "diagonally adjacent"), then we can construct a similar example when $d = 2$.
In this case, define $e(x)$ to be 0 or 1 as the sum of the coordinates of $x$ is respectively even
or odd. Exactly as in the previous example, when $e(x) = e(y)$, we take $\Phi_{x,y}(\phi)$ to be
0 if $\phi(x) = \phi(y)$, $C$ if $|\phi(x) - \phi(y)| = 1$, and infinity otherwise. And when $e(x) \ne e(y)$, we
have $\Phi_{x,y}(\phi) = 0$ if $|\phi(x) - \phi(y)| = \frac{1}{2}$ and $\infty$ otherwise. Take $\mathcal{L} = 2\mathbb{Z}^2$. As in the previous
example, there are two distinct $\mathcal{L}$-invariant ground states, and a Peierls argument implies
that at sufficiently low temperature, there exist at least two distinct minimal $\mathcal{L}$-ergodic
gradient phases, each of which is a small perturbation of one of these ground states.

Given the simplicity of the above examples, it is perhaps surprising that (as we will see
in Chapter 9) minimal gradient phase multiplicity never occurs for $u \in U_\Phi$ when $d = 2$ and
$\Phi$ is a simply attractive potential.
Chapter 9

Discrete, two-dimensional gradient 
phases 

9.1 Height offsets and main result 

Throughout this chapter, we assume $m = 1$, $d = 2$, $E = \mathbb{Z}$, and $\Phi$ is simply attractive.
When this is the case, we can completely classify the minimal gradient phases. In the
previous chapter, we proved, for general $d \ge 1$, that when $E = \mathbb{R}$, there is a unique minimal
gradient phase of each slope $u \in U_\Phi$, and this phase is extremal. When $E = \mathbb{Z}$, we found,
for $d \ge 1$, that if there exists a rough minimal gradient phase of slope $u \in U_\Phi$, then it is
the unique minimal gradient phase of slope $u$. We also found, when $E = \mathbb{Z}$ and $d \ge 1$, that
each smooth minimal gradient phase of slope $u \in U_\Phi$ is completely determined by its "height
offset spectrum", the measure on $[0, 1)$ given by $h(\mu)$ modulo 1. The main purpose of this
chapter is to show that, in two dimensional systems, this spectrum is trivial and the minimal
gradient phase of slope $u$ is unique:

Theorem 9.1.1 Suppose that $d = 2$, $E = \mathbb{Z}$, and $\Phi$ is simply attractive. Then for every
$u \in U_\Phi$, there exists a unique minimal gradient phase $\mu_u$ of slope $u$. This $\mu_u$ is extremal. In
particular, if $\mu_u$ is a smooth phase, then its height offset spectrum is a point mass.

By Lemma 8.4.2, the height offset spectrum of $\mu_u$ can only be a point mass if $u$ lies in the dual lattice of $\mathcal{L}$.
This implies the following corollary:

Corollary 9.1.2 Each $\mu_u$ described in Theorem 9.1.1 is a rough phase unless $u$ lies in the dual lattice of $\mathcal{L}$.

We refer to the slopes in the dual lattice of $\mathcal{L}$ for which $\mu_u$ is smooth as smooth slopes. In [64], a work
by this author and two other authors, we show how to determine, for a class of Lipschitz
simply attractive potentials $\Phi$ based on perfect matchings of periodic bipartite graphs,
precisely which of the possible smooth slopes are smooth. Depending on $\Phi$, none, all, or some
nonempty proper subset of the vertices of the dual lattice lying in $U_\Phi$ will be smooth slopes. (We also prove, in
that context, a direct correspondence between the smooth slopes and the non-strictly-convex
regions, a.k.a. facets, of certain surface-tension-minimizing surfaces subject to boundary
and volume constraints.) In general, we do not know how to determine explicitly which of
the slopes of the dual lattice lying in $U_\Phi$ are smooth for any other families of two dimensional simply attractive
models.

9.2 FKG inequality 

Let $\mu$ be a probability density on $\Omega$ that is a finite combination of point measures (i.e.,
measures supported on a single $\phi \in \Omega$). We say that $\mu$ has the MTP$_2$ (multivariate total
positivity) property if $\mu(\phi_1)\mu(\phi_2) \le \mu(\max(\phi_1, \phi_2))\,\mu(\min(\phi_1, \phi_2))$ for all $\phi_1, \phi_2 : \Lambda \to E$.

As in the previous chapter, we say a function $f : \Omega \to \mathbb{R}$ is increasing if for each
$\phi_1, \phi_2 \in \Omega$ with $\phi_1 \le \phi_2$, we have $f(\phi_1) \le f(\phi_2)$. We say $f$ is decreasing if $-f$ is increasing.
We say an event $A \in \mathcal{F}$ is increasing (decreasing) if $1_A$ is increasing (decreasing). The FKG
inequality states that whenever $\mu$ has the MTP$_2$ property, any two bounded, increasing
functions $f$ and $g$ on $\Omega$ are non-negatively correlated; i.e., $\mu(fg) \ge \mu(f)\mu(g)$. (See the
original paper by Fortuin, Kasteleyn, and Ginibre [32].) In particular, this implies that any
two increasing events (or any two decreasing events) are non-negatively correlated. We say a
general measure $\mu$ on $(\Omega, \mathcal{F})$ satisfies the FKG inequality if each pair of increasing events in $\mathcal{F}$
is non-negatively correlated. (This implies the analogous statement about general bounded
functions $f$ and $g$, as is easily seen by approximating $f$ and $g$ with step functions.)
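
The link between the MTP$_2$ condition and positive correlation can be checked by brute force on a tiny state space. The sketch below is only an illustration under assumptions that are not in the text: a three-site path, heights restricted to {0, 1, 2}, and the convex pair energy V(eta) = eta^2. All function names are hypothetical.

```python
import itertools
import math


def gibbs_measure(n_sites, heights, pair_energy):
    """Normalized Gibbs weights on a path of n_sites vertices (toy model, an assumption)."""
    configs = list(itertools.product(heights, repeat=n_sites))
    weights = [
        math.exp(-sum(pair_energy(c[i + 1] - c[i]) for i in range(n_sites - 1)))
        for c in configs
    ]
    total = sum(weights)
    return {c: w / total for c, w in zip(configs, weights)}


def has_mtp2(mu, tol=1e-12):
    """Check mu(a) mu(b) <= mu(max(a, b)) mu(min(a, b)) for every pair a, b."""
    for a in mu:
        for b in mu:
            lower = tuple(map(min, a, b))
            upper = tuple(map(max, a, b))
            if mu[a] * mu[b] > mu[lower] * mu[upper] + tol:
                return False
    return True


def covariance(mu, f, g):
    """Covariance mu(fg) - mu(f) mu(g) of two functions of the configuration."""
    mean_f = sum(p * f(c) for c, p in mu.items())
    mean_g = sum(p * g(c) for c, p in mu.items())
    mean_fg = sum(p * f(c) * g(c) for c, p in mu.items())
    return mean_fg - mean_f * mean_g


# Convex pair energy: the measure is MTP2, so increasing functions correlate positively.
mu = gibbs_measure(3, (0, 1, 2), lambda eta: eta ** 2)
first_height = lambda c: c[0]                       # increasing function
last_is_high = lambda c: 1.0 if c[2] >= 1 else 0.0  # indicator of an increasing event
```

With a concave pair energy such as V(eta) = -eta^2 the same check fails, which is one way to see that convexity is doing the work here.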

We say a potential $\Phi$ is submodular if for every $\Lambda \subset\subset \mathbb{Z}^d$, $\Phi_\Lambda$ has the property that

$$\Phi_\Lambda(\phi_1) + \Phi_\Lambda(\phi_2) \ge \Phi_\Lambda(\min(\phi_1, \phi_2)) + \Phi_\Lambda(\max(\phi_1, \phi_2)).$$

In particular, every simply attractive potential is submodular. Since $\Phi$ is submodular
and $Z_\Lambda(\phi)$ is finite, it is clear that $\gamma_\Lambda(\cdot \mid \phi)$ has the MTP$_2$ property and hence satisfies
the FKG inequality.
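
As a sanity check on the submodularity inequality, the following sketch tests it exhaustively for a single edge. It assumes (following the abstract's description of the potentials as convex, nearest-neighbor and gradient) that the edge energy has the form V(phi(y) - phi(x)) with V convex; the helper names are hypothetical.

```python
import itertools


def edge_energy(pair_energy, phi):
    """Energy of a single edge (x, y) for a two-site configuration phi = (phi(x), phi(y))."""
    return pair_energy(phi[1] - phi[0])


def is_submodular_on_edge(pair_energy, heights):
    """Check H(phi1) + H(phi2) >= H(min) + H(max) over all pairs of two-site configurations."""
    configs = list(itertools.product(heights, repeat=2))
    for phi1, phi2 in itertools.product(configs, repeat=2):
        lower = tuple(map(min, phi1, phi2))
        upper = tuple(map(max, phi1, phi2))
        lhs = edge_energy(pair_energy, phi1) + edge_energy(pair_energy, phi2)
        rhs = edge_energy(pair_energy, lower) + edge_energy(pair_energy, upper)
        if lhs < rhs:
            return False
    return True
```

Since a sum of submodular functions is submodular, checking one edge at a time is enough for a nearest-neighbor Hamiltonian on a finite set.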

Lemma 9.2.1 If $\Phi$ is simply attractive and $\mu$ is an extremal Gibbs measure, then $\mu$ satisfies
the FKG inequality.

Proof Let $A$ and $B$ be increasing events in $\mathcal{F}$. By the reverse martingale theorem and the
tail triviality of $\mu$, we have $\gamma_{\Lambda_n}(A \mid \phi) \to \mu(A)$ for $\mu$-almost all $\phi \in \Omega$; the same is true of
increasing events $B$ and $A \cap B$. Since each $\gamma_{\Lambda_n}(\cdot \mid \phi)$ satisfies the FKG inequality, the result
follows. ∎

Lemma 9.2.2 Let $\mu_u$ be a smooth minimal gradient phase with slope $u \in U_\Phi$. Extend $\mu_u$
to $(\Omega, \mathcal{F})$ in such a way that $h(\phi)$ is $\mu_u$-almost surely in $[0,1)$. Then $\mu_u$ satisfies the FKG
inequality.

Proof Let $A$ and $B$ be increasing events in $\mathcal{F}$. By Theorems 8.7.8 and 8.7.9, we can write
$\mu_u = \int_0^1 \mu_{u,a}\,da$ (if $u$ has an irrational component) or $\mu_u = \frac{1}{n}\sum_{i=0}^{n-1} \mu_{u,c+i/n}$ (for some $n$, if $u$ is
rational). The same theorems imply that $\mu_{u,a}(A)$, $\mu_{u,a}(B)$, and $\mu_{u,a}(A \cap B)$ are all increasing
functions in $a$. Moreover, Lemma 9.2.1 implies that $\mu_{u,a}(A \cap B) \ge \mu_{u,a}(A)\,\mu_{u,a}(B)$ for each
$a$.

In the irrational case, we have:

$$\mu_u(A \cap B) = \int_0^1 \mu_{u,a}(A \cap B)\,da \ge \int_0^1 \mu_{u,a}(A)\,\mu_{u,a}(B)\,da.$$

Applying the FKG inequality to the increasing (in $a$) functions $f(a) = \mu_{u,a}(A)$ and $g(a) = \mu_{u,a}(B)$
and the uniform measure on $a \in [0, 1)$, we can then say

$$\int_0^1 \mu_{u,a}(A)\,\mu_{u,a}(B)\,da \ge \int_0^1 \mu_{u,a}(A)\,da \int_0^1 \mu_{u,a}(B)\,da = \mu_u(A)\,\mu_u(B).$$

A similar argument applies in the rational case; in this case, we replace uniform measure on
$[0, 1)$ with uniform measure on $\{c, c + 1/n, \ldots, c + (n - 1)/n\}$. ∎
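
The one-dimensional step used in this proof (two increasing functions of $a$ are non-negatively correlated under uniform measure) can be illustrated numerically in the rational case. The sketch below uses exact rational arithmetic; the two sample functions are hypothetical stand-ins for $a \mapsto \mu_{u,a}(A)$ and $a \mapsto \mu_{u,a}(B)$, not quantities computed from any model.

```python
from fractions import Fraction


def mean(values):
    """Average of a list of Fractions under uniform measure."""
    return sum(values, Fraction(0)) / len(values)


def uniform_covariance(f_values, g_values):
    """Covariance of two functions sampled at the n points c, c + 1/n, ..., c + (n-1)/n."""
    products = [f * g for f, g in zip(f_values, g_values)]
    return mean(products) - mean(f_values) * mean(g_values)


n = 6
# Hypothetical increasing functions of a, listed in order of increasing a.
f_values = [Fraction(i, n) ** 2 for i in range(n)]
g_values = [Fraction(1 if i >= 4 else 0) for i in range(n)]
```

Reversing one of the two lists makes it decreasing, and the covariance then changes sign, as expected.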

We present one more straightforward fact about the FKG inequality. 

Lemma 9.2.3 If $\mu$ satisfies the FKG inequality and $\nu$ satisfies the FKG inequality, then
$\mu \otimes \nu$ satisfies the FKG inequality.

Proof We aim to show

$$\iint 1_A(x,y)\,1_B(x,y)\,\nu(dy)\,\mu(dx) \ge \iint 1_A(x,y)\,\nu(dy)\,\mu(dx) \iint 1_B(x,y)\,\nu(dy)\,\mu(dx),$$

where $A$ and $B$ are increasing events and the integrals are over the spaces on which $\mu$
and $\nu$ are defined. The functions $f_A(x) = \int 1_A(x,y)\,\nu(dy)$, $f_B(x) = \int 1_B(x,y)\,\nu(dy)$, and
$f_{A \cap B}(x) = \int 1_{A \cap B}(x,y)\,\nu(dy)$ are clearly increasing functions of $x$; moreover, $f_A(x) f_B(x) \le f_{A \cap B}(x)$
pointwise by the FKG inequality for $\nu$. Combining this with the FKG inequality
for $\mu$, we have

$$\begin{aligned}
\iint 1_{A \cap B}(x,y)\,\nu(dy)\,\mu(dx) &= \int f_{A \cap B}(x)\,\mu(dx) \\
&\ge \int f_A(x)\,f_B(x)\,\mu(dx) \\
&\ge \int f_A(x)\,\mu(dx) \int f_B(x)\,\mu(dx) \\
&= \iint 1_A(x,y)\,\nu(dy)\,\mu(dx) \iint 1_B(x,y)\,\nu(dy)\,\mu(dx).
\end{aligned}$$
9.3 Reduction to statement about {0, 1} measures

In this section, we let $\Omega_\Gamma$ be the space of subsets $\Gamma$ of $\mathbb{Z}^2$; we may think of elements of $\Omega_\Gamma$
as functions $\phi : \mathbb{Z}^2 \to \{0, 1\}$, where $\phi(x) = 1$ if and only if $x \in \Gamma$. Let $\mathcal{F}_\Gamma$ be the usual
product $\sigma$-algebra on $\Omega_\Gamma$. We will derive the main result of this chapter, Theorem 9.1.1, as
a consequence of the following theorem:

Theorem 9.3.1 There exists no measure $\rho$ on $(\Omega_\Gamma, \mathcal{F}_\Gamma)$ which possesses all of the following
properties:

1. $\rho$ satisfies the FKG inequality, i.e., there exist no two increasing events $A$ and $B$ of $\Gamma$
for which $\rho(A)\rho(B) > \rho(A \cap B)$.

2. With $\rho$ probability one, one of the following events occurs (the second occurring
with positive probability):

(a) $\Gamma = \emptyset$.

(b) $\Gamma$ and $\mathbb{Z}^2 \setminus \Gamma$ are both infinite connected subsets of $\mathbb{Z}^2$. (Equivalently, if $\Gamma$ is treated
in the dual sense as a subset of lattice squares of $\mathbb{Z}^2$, then the boundary between
$\Gamma$ and its complement consists of a single infinite, non-self-intersecting path $P_\Gamma$.)

3. The random infinite boundary path $P_\Gamma$ described in the previous item, conditioned on
such a path existing, has a law that is $\mathcal{L}$-invariant.

(The reader may verify the equivalence asserted in the second statement.)

We will prove Theorem 9.3.1 in the next section. In this section, we prove that Theorem
9.3.1 implies Theorem 9.1.1. To do this, we show that if either $\mu$ is a non-extremal minimal
gradient phase of slope $u \in U_\Phi$ or $\mu_1$ and $\mu_2$ are distinct extremal minimal gradient phases
of slope $u \in U_\Phi$, then we can construct a measure $\rho$ which violates Theorem 9.3.1.

In the former case, write $\bar\mu = \mu \otimes \mu \otimes \pi \otimes \pi$, where $\pi$ is the measure on $S$ as defined
in Section 8.3; here, we will view $\mu$ as being extended to $(\Omega, \mathcal{F})$ in such a way that $h(\phi)$ is
$\mu$-almost surely in $[0, 1)$. By Lemma 9.2.2, $\mu$ defined on $(\Omega, \mathcal{F})$ in this way satisfies the FKG
inequality. Now, given $(\phi_1, \phi_2, r_1, r_2)$ sampled from $\bar\mu$, we define $r$ on an edge $e = (x,y)$ by

$$r(e) = \begin{cases} r_1(e) & \phi_2(x) > \phi_1(x) \text{ and } \phi_2(y) > \phi_1(y) \\ r_2(e) & \text{otherwise.} \end{cases}$$

Clearly, $(\phi_1, \phi_2, r)$ has the same law as $\mu \otimes \mu \otimes \pi$. Our reason for introducing $r_1$ and $r_2$ is
that certain random sets (described below) are monotone in these $r_1$ and $r_2$, so that the
FKG inequality applies, even though they are not monotone in $r$.

Namely, we claim that the indicator functions of $T_0^+$ (defined in terms of $(\phi_1, \phi_2, r)$, as in
Section 8.5.1) and $\mathbb{Z}^2 \setminus T_1^-$ are both increasing functions of the four-tuple $(\phi_2, -\phi_1, r_2, -r_1)$.
(Recall that $T_0^+$ is an infinite set on which $\phi_2 > \phi_1$ and $T_1^-$ is an infinite set on which $\phi_2 < \phi_1$.)
To see this, suppose that $(\phi_1', \phi_2', r_1', r_2')$ are such that

$$(\phi_2', -\phi_1', r_2', -r_1') \ge (\phi_2, -\phi_1, r_2, -r_1).$$

Let $A_+$ be the set of points on which $\phi_2(x) > \phi_1(x)$ and $A_-$ the complement, i.e., the set of
points on which $\phi_2(x) < \phi_1(x) + 1$; define $A_+'$ and $A_-'$ analogously. By definition, $T_0^+$ is the
set of points in those infinite components of Sq (the complement of So) which are contained 
in A + . (Recall that any edge connecting a vertex in A + to a vertex in A~ necessarily belongs 
to Sq.) Clearly, A' + is a superset of A + . Now, we would like to show that if e = (x, y), with 
A + , and e G Sq, then we also have e G (Sq)'. This will imply that (T + )' is indeed a 
superset of T + . (Here (T + )' and (Sq)' are defined in the obvious way, using (0^, 2 , r[, r' 2 ) 
instead of $(\phi_1, \phi_2, r_1, r_2)$.) Note first that since $-r_1' \ge -r_1$ we have $r'(e) \le r(e)$. Second,
the amount of energy required for a swap is

$$\bigl(V_{x,y}(\phi_1(y) - \phi_2(x)) + V_{x,y}(\phi_2(y) - \phi_1(x))\bigr) - \bigl(V_{x,y}(\phi_2(y) - \phi_2(x)) + V_{x,y}(\phi_1(y) - \phi_1(x))\bigr).$$

Since $V_{x,y}$ is convex and $\phi_2(x) > \phi_1(x)$, it is clear that increasing $\phi_2(y)$ can only increase
the value of the above expression. Similar observations show that increasing $\phi_2(x)$, decreasing
$\phi_1(y)$, and decreasing $\phi_1(x)$ can also only increase the value of the above expression. It follows
that if $e \in S_0^c$, we must also have $e \in (S_0^c)'$; and thus $T_0^+$ is an increasing function of the
four-tuple as claimed. A similar argument shows that the indicator of $T_1^-$ is a decreasing
function of the same four-tuple, and hence $\mathbb{Z}^2 \setminus T_1^-$ is an increasing function.
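
The four monotonicity claims about the swap energy can be checked exhaustively on a small range of integer heights. The sketch below assumes the swap energy has the form "energy of the edge after exchanging $\phi_1$ and $\phi_2$ at one endpoint, minus the energy before", which is a reading of the displayed expression rather than a formula confirmed by the source; the function names are hypothetical.

```python
import itertools


def swap_energy(V, p1x, p1y, p2x, p2y):
    """Assumed form: edge energy with phi_1, phi_2 exchanged at y, minus the original energy."""
    swapped = V(p1y - p2x) + V(p2y - p1x)
    original = V(p2y - p2x) + V(p1y - p1x)
    return swapped - original


def is_monotone(V, heights):
    """Check the four monotonicity claims whenever phi_2 > phi_1 at both endpoints."""
    for p1x, p1y, p2x, p2y in itertools.product(heights, repeat=4):
        if not (p2x > p1x and p2y > p1y):
            continue
        base = swap_energy(V, p1x, p1y, p2x, p2y)
        moved = [
            swap_energy(V, p1x, p1y, p2x, p2y + 1),  # increase phi_2(y)
            swap_energy(V, p1x, p1y, p2x + 1, p2y),  # increase phi_2(x)
            swap_energy(V, p1x, p1y - 1, p2x, p2y),  # decrease phi_1(y)
            swap_energy(V, p1x - 1, p1y, p2x, p2y),  # decrease phi_1(x)
        ]
        if any(value < base for value in moved):
            return False
    return True
```

For V(eta) = eta^2 the assumed expression simplifies to $2(\phi_2(x) - \phi_1(x))(\phi_2(y) - \phi_1(y))$, which makes the monotonicity visible by inspection.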

Recall by Theorem 18 . 7. 81 and Theorem 18 . 7. 91 that fi can be written fi = fJ, Uja da (if u has 
an irrational component) or /i = ^27=0 ^u,c+i/n (for some n, if u is rational). Thus, sampling 
(0i,02) from fi ® /i is equivalent to first independently sampling a\ and a 2 from uniform 
measure on either [0, 1) or {c, c + 1/n, . . . , c + (n — l)/n}, and then sampling (0i, 02) from 
(j^m,oi ® A t « i a 2 )- m either case, there is a /i ® /i positive probability that a 2 > ai, and thus 
^-(02) > h(4>i). By Lemma 18.7.21 we have n^ 1 -< 7r^ 2 -< it^ 1 + 1 in this case; by Lemma 
18.5.31 this implies that B + = 1 and S_ = — and thus, both T + and T{~ are non-empty 
(conditioned on a 2 > ai). By Lemma T8.5.61 each of these sets is /Z-almost surely a single 
infinite cluster; these clusters are clearly disjoint. 

Define $\tilde T_0^+$ to be the union of $T_0^+$ and all finite components of its complement. Clearly,
$\tilde T_0^+$ is also increasing in the four-tuple, and both it and its complement are connected. The
same is true of $\mathbb{Z}^2 \setminus \tilde T_1^-$, defined similarly. When $T_0^+$ and $T_1^-$ are disjoint, it is not hard to see
that $\tilde T_1^-$ and $\tilde T_0^+$ are also disjoint; in particular, they are both infinite connected sets with
infinite connected complements.

An important observation is that $\tilde T_1^-$ and $\tilde T_0^+$ are gradient measurable functions of the
four-tuple: that is, given $(\phi_1, \phi_2, r_1, r_2)$ with $\phi_1$ and $\phi_2$ defined only up to additive constant,
we can $\mu \otimes \mu \otimes \pi \otimes \pi$-almost surely determine the additive constants that make $0 \le h(\phi_i) < 1$
for $i \in \{1, 2\}$, and this determines $\tilde T_1^-$ and $\tilde T_0^+$.

A natural question to ask now is this: if we translate the four-tuple $(\phi_1, \phi_2, r_1, r_2)$ (with
$\phi_1$ and $\phi_2$ defined only up to additive constant) by some $x \in \mathcal{L}$, does this have the effect
of translating $\tilde T_1^-$ and $\tilde T_0^+$ by $x$? The answer is almost yes. Only the values of $h(\phi_1)$ and
$h(\phi_2)$ modulo one are defined in the gradient $\sigma$-algebra; when additive constants are chosen
so both values are in $[0, 1)$, write $a = h(\phi_2) - h(\phi_1)$; assume without loss of generality that
$a > 0$. In any case, we will have $-1 < a < 1$.

Now, translating the four-tuple by $x$ has the effect of changing $h(\phi_1)$ and $h(\phi_2)$ by $(u, x)$;
now, if we add integer constants to make $h(\phi_1) + (u,x)$ and $h(\phi_2) + (u,x)$ lie in $[0, 1)$, we
will not necessarily find $h(\phi_2) - h(\phi_1) = a$; if adding $(u, x)$ does not change the integer parts
of $h(\phi_1)$ and $h(\phi_2)$ by the same amounts, we may find instead that $h(\phi_2) - h(\phi_1) = a - 1$.
But regardless of $x$, we will have either $h(\phi_2) - h(\phi_1) = a$ or $h(\phi_2) - h(\phi_1) = a - 1$. In the
first case, translating the four-tuple (defined up to additive constant) by $x$ does indeed have
the effect of translating $\tilde T_1^-$ and $\tilde T_0^+$ by $x$. In the second case, it has the effect of translating
$\tilde T_1^-$ and $\tilde T_0^+$ by $x$ and swapping the two sets. (The reader may easily check that adding 1
to $\phi_1$ and subsequently swapping the labels of $\phi_1$ and $\phi_2$ does indeed have this effect.)

Thus, translating $(\phi_1, \phi_2, r_1, r_2)$ by $x$ also translates the unordered pair of sets $(\tilde T_1^-, \tilde T_0^+)$
by $x$. Since the law of this four-tuple is $\mathcal{L}$-invariant on the gradient $\sigma$-algebra, this implies
that the law of the unordered pair $(\tilde T_1^-, \tilde T_0^+)$ is $\mathcal{L}$-invariant as well.

Now, we are ready to define our random set $\Gamma$. Choose $\chi$ uniformly from the two-element
set $\{0, 1\}$, and choose the four-tuple $(\phi_1, \phi_2, r_1, r_2)$ from $\bar\mu$. We now write:

$$\Gamma = \begin{cases} \tilde T_0^+ & h(\phi_2) > h(\phi_1),\ \chi = 0 \\ \mathbb{Z}^2 \setminus \tilde T_1^- & h(\phi_2) > h(\phi_1),\ \chi = 1 \\ \emptyset & \text{otherwise.} \end{cases}$$

By the discussion above, the boundary between $\Gamma$ and its complement has an $\mathcal{L}$-invariant
law. Also, since $\tilde T_0^+ \subset \mathbb{Z}^2 \setminus \tilde T_1^-$, the indicator function of $\Gamma$ is an increasing function of
$(\phi_1, -\phi_2, r_1, -r_2)$ and $\chi$. Each of the five independent components satisfies the FKG
inequality, and it follows from Lemma 9.2.3 that increasing events in this five-tuple are
non-negatively correlated. In particular, any two increasing functions of $\Gamma$ are non-negatively
correlated; in other words, $\Gamma$ satisfies the FKG inequality. Thus, the random set $\Gamma$, which
we produced using a non-extremal minimal gradient phase $\mu_u$ of slope $u \in U_\Phi$, is in
violation of Theorem 9.3.1. We conclude that no non-extremal minimal gradient phase $\mu_u$ of
slope $u \in U_\Phi$ can exist.

Now, recall that we also promised to use Theorem 9.3.1 to rule out the existence of distinct
minimal gradient phases $\mu_1$ and $\mu_2$ of slope $u \in U_\Phi$. The above argument implies that both
such measures must be extremal. Assume such measures exist and, without loss of generality,
$h(\mu_2) > h(\mu_1)$. In this case, we can take $\bar\mu = \mu_1 \otimes \mu_2 \otimes \pi \otimes \pi$ and simply take $\Gamma = \tilde T_0^+$ as
defined above. As before, it is clear that this set satisfies the FKG inequality; and in this
case, $\Gamma = \tilde T_0^+$ does have an $\mathcal{L}$-invariant law. Almost surely, it is connected and has a connected
complement, and hence its existence contradicts Theorem 9.3.1. We have now proved that
Theorem 9.3.1 implies Theorem 9.1.1.
9.4 Proof of statement about {0, 1} measures
9.4.1 Definitions and overview of proof 

In this section, we prove Theorem 9.3.1. Our proof is in some ways similar to the original
Russo-Seymour-Welsh proof of the non-existence of infinite clusters in critical
two-dimensional Bernoulli percolation (see, e.g., [51] for this proof and many relevant references).
However, their lemma and their proof relied heavily on $\mu$ having reflection symmetries as
well as translational symmetries. Because we are not assuming any reflection symmetries,
our construction will require a little bit more machinery than the analogous construction in
[51], including some topological results.

Suppose that $\rho$ is as described in Theorem 9.3.1. Let $A_\infty$ be the event that $\Gamma$ and $\mathbb{Z}^2 \setminus \Gamma$
are infinite connected sets; by assumption, $\rho(A_\infty) > 0$. Let $A_\emptyset$ be the event that $\Gamma = \emptyset$.
Also by assumption, $\rho(A_\infty \cup A_\emptyset) = 1$. Now, we take $\varepsilon$ to be an extremely small fixed
constant: $\varepsilon = 10^{-10000}\rho(A_\infty)^{10000}$ will comfortably suffice for all of our arguments. Let $A_k$
be the event that $A_\infty$ occurs and that both $\Gamma \cap \Lambda_k$ and $(\mathbb{Z}^2 \setminus \Gamma) \cap \Lambda_k$ are non-empty. (In
this section, we assume for convenience that $\Lambda_k$ is shifted to be centered at the origin, i.e.,
$\Lambda_k = [\lfloor -k/2 \rfloor, \lfloor k/2 - 1 \rfloor]^2 \subset \mathbb{Z}^2$.) Choose $k$ large enough so that $\rho(A_k) \ge \rho(A_\infty) - \varepsilon$.
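
For concreteness, the centered box and its translates can be written out as follows. This is a sketch under one assumption: the damaged brackets in the definition of the box are read as floor functions, which gives a $k \times k$ box for both even and odd $k$. The function names are hypothetical.

```python
import math


def centered_box(k):
    """Vertices of the k-by-k box, assuming the endpoints are floor(-k/2) and floor(k/2 - 1)."""
    lo = math.floor(-k / 2)
    hi = math.floor(k / 2 - 1)
    return {(i, j) for i in range(lo, hi + 1) for j in range(lo, hi + 1)}


def shifted_box(k, n, v):
    """The translate n*v + (centered box), for a lattice point v = (v1, v2)."""
    return {(n * v[0] + i, n * v[1] + j) for i, j in centered_box(k)}
```

With $k = 4$ and $n = 11$, the values used in Figure 9.1, neighbouring shifted boxes are disjoint and separated by a corridor of vertices.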

Let $B_k$ be the event that there exists a path, consisting entirely of elements in $\Gamma \setminus \Lambda_k$,
which encircles the set $\Lambda_k$. Clearly, $B_k$ is disjoint from both $A_k$ and $A_\emptyset$, and hence $\rho(B_k) \le \varepsilon$.
We will use topological arguments and the FKG inequality to prove that $\rho(B_k) > \varepsilon$, hence
proving Theorem 9.3.1 by contradiction.

Fix an integer $n > 2k$. Denote by $\Lambda(v)$ the "shifted box" $nv + \Lambda_k$ and write $\Lambda =
\bigcup_{v \in \mathbb{Z}^2} \Lambda(v)$. Write $\bar\Lambda(v) = nv + \Lambda_{k+1}$. Let $\tilde\Lambda(v)$ be the outer band of square faces around
$\Lambda(v)$, i.e., the set of square faces of $\mathbb{Z}^d$ that are incident to at least one vertex of $\Lambda(v)$ and
at least one vertex of $\bar\Lambda(v) \setminus \Lambda(v)$. Also, take $\tilde\Lambda = \bigcup_{v \in \mathbb{Z}^d} \tilde\Lambda(v)$. See Figure 9.1 for a dual
version (i.e., squares depicted as vertices and vice versa) of this picture.

We can think of the path $P_\Gamma$, which is a sequence of square faces of $\mathbb{Z}^2$ (or equivalently,
a sequence of vertices in the dual graph), as being oriented in such a way that $\Gamma$ lies on its
left. Denote by $A(v)$ the event that $P_\Gamma$ hits $\tilde\Lambda(v)$, and in between the first and last times $P_\Gamma$
hits $\tilde\Lambda(v)$, $P_\Gamma$ hits no square which is fewer steps away from a $\tilde\Lambda(w)$, with $w \ne v$, than it is
from $\tilde\Lambda(v)$. For example, $A(v)$ occurs whenever $\Lambda(v)$ is one of the $4 \times 4$ boxes in Figure 9.1
except when $\Lambda(v)$ is the top left box.

Clearly, by choosing $n$ large enough so that the probability that $P_\Gamma$ takes more than $n - k$
steps between visits to $\tilde\Lambda(v)$ is small, we can make the probability of $A_\infty \setminus A(v)$ arbitrarily
small.

We will henceforth assume that in addition to satisfying $n > 2k$, we also have that
$n\mathbb{Z}^2 \subset \mathcal{L}$ and that $n$ is large enough so that $\rho(A(0)) \ge \rho(A_\infty) - \varepsilon$. The $\mathcal{L}$-invariance of
the law of the path $P_\Gamma$ implies that $\rho(A(v)) \ge \rho(A_\infty) - \varepsilon$ for any $v \in \mathbb{Z}^2$; or equivalently,
$\rho(A_\infty \setminus A(v)) \le \varepsilon$. The values of $n$, $k$, and $\varepsilon$ will remain fixed throughout the proof.

Given the event $A(v)$, we define $v_+$ and $v_-$ to be such that $\tilde\Lambda(v_-)$ is the last band that
the path $P_\Gamma$ hits before the first time it hits $\tilde\Lambda(v)$. Similarly, $\tilde\Lambda(v_+)$ is the first band that
the path $P_\Gamma$ hits after the last time it hits $\tilde\Lambda(v)$.
Figure 9.1: Possible illustration of $\Gamma$ and $\Lambda$ in dual perspective when $k = 4$ and $n = 11$: the
shaded squares are elements of $\Gamma$, squares inside $4 \times 4$ boxes are elements of $\Lambda$, vertices on
boundaries of these boxes are members of $\tilde\Lambda$. The infinite path $P_\Gamma$ is the boundary between
$\Gamma$ and its complement.
Let $C(v,w)$ be the event that some vertex incident to $\tilde\Lambda(v)$ and some vertex incident
to $\tilde\Lambda(w)$ are in the same connected component of $\Gamma \setminus \Lambda$. In other words, $C(v,w)$ is the
event that there is a path in $\Gamma$ that connects vertices incident to $\tilde\Lambda(v)$ and $\tilde\Lambda(w)$ without
passing through any other box $\Lambda(x)$, $x \notin \{v,w\}$. Given that $A(w)$ occurs, it is easy to see
that $C(w,w_+)$ (respectively, $C(w,w_-)$) must occur as well; we see this by taking a path
comprised of vertices that lie immediately to one side of the portion of $P_\Gamma$ that connects
$\tilde\Lambda(w)$ and $\tilde\Lambda(w_+)$ (respectively, $\tilde\Lambda(w_-)$).

Given a non-self-intersecting path $p = v_1, \ldots, v_r$ in $\Gamma \setminus \Lambda$ connecting $\Lambda(v)$ and $\Lambda(w)$,
we define a continuous, non-self-intersecting path $\bar p$ from $nv$ to $nw$ by connecting the dots
in the sequence $nv, v_1, \ldots, v_r, nw$ with straight line segments. Since $v_1 \in \bar\Lambda(v) \setminus \Lambda(v)$ and
$v_r \in \bar\Lambda(w) \setminus \Lambda(w)$ are initial and final points of $p$, it is not hard to see that the initial and
final segments of $\bar p$ do not intersect any of the other segments. Given any continuous path
$q$ from $nv$ to $nw$ which is contained (except for its two endpoints) in $\mathbb{R}^2 \setminus n\mathbb{Z}^2$, we denote
by $C_q(v,w)$ the event that there exists a path $p$ in $\Gamma \setminus \Lambda$, connecting $\Lambda(v)$ and $\Lambda(w)$, for
which $\bar p$ is homotopically equivalent to $q$ (i.e., there exists a continuous $Q : [0,1]^2 \to \mathbb{R}^2$ with
$Q(t,0) = \bar p(t)$, $Q(t,1) = q(t)$, $Q(0,u) = nv$, $Q(1,u) = nw$ for all $t, u \in [0,1]$, and $Q(t,u) \in \mathbb{R}^2 \setminus n\mathbb{Z}^2$
for all $(t,u) \in (0,1) \times [0,1]$). Denote by $C_q^+(v,w)$ the event that $w = v_+$ and the path $P_\Gamma$
from $\tilde\Lambda(v)$ to $\tilde\Lambda(w)$ is homotopic to $q$. Define $C_q^-(v,w)$ analogously using $v_-$ instead of $v_+$.
Clearly, $C_q^+(v,w)$ and $C_q^-(v,w)$ are both contained in the event $C_q(v,w)$.

Let $B(v)$ be the event that there exists a cycle in $\Gamma \setminus \Lambda$ which disconnects $\Lambda(v)$ from
infinity. We will ultimately prove Theorem 9.3.1 by showing, first, that certain combinations
of events of the form $C_q(v,w)$ together imply, for topological reasons, the event $B(0)$; we
then bound the probabilities of these combinations of events using the FKG inequality and
the shift-invariance of the law of $P_\Gamma$. This will enable us to prove that $\rho(B(0)) > \varepsilon$, a
contradiction.

Our next step is to review some basic facts about the topology of the countably punctured 
plane. 

9.4.2 Homotopy classes of paths in countably punctured plane 

It is well known that the homotopy group of $\mathbb{R}^2$ minus a discrete set of points is given by the
free group generated by those points. (See, e.g., Section 3.5 and Exercise 3.4 of [?].) Instead
of dealing with $\mathbb{R}^2 \setminus \mathbb{Z}^2$, however, it will be convenient for us to deal with the closed countably
punctured plane defined as follows. First, consider a closed annulus of inner radius $\varepsilon/2$ and
outer radius $\varepsilon$. We can map this bijectively onto the space $D' = (D_\varepsilon \setminus \{0\}) \cup (\{0\} \times S^1)$, where
$D_\varepsilon$ is the closed disc of radius $\varepsilon$ and $S^1$ is the unit circle, by sending a point $(r, \theta)$ (defined
in polar coordinates) to $(2r - \varepsilon, \theta)$ if $r \ne \varepsilon/2$, and to $0 \times \theta$ if $r = \varepsilon/2$. We endow $D'$ with the
topology that makes $D'$ and the annulus homeomorphic (when the annulus is endowed with
the standard topology induced by the Euclidean metric).

We define the closed countably punctured plane $W = (\mathbb{R}^2 \setminus \mathbb{Z}^2) \cup (\mathbb{Z}^2 \times S^1)$ analogously. It is
homeomorphic to $\mathbb{R}^2 \setminus [\mathbb{Z}^2 + D]$ (where $D$ is a closed disc of any radius less than 1/2), and
it is not hard to see that it has the same homotopy group as $\mathbb{R}^2 \setminus \mathbb{Z}^2$. Intuitively, the closed
countably punctured plane is obtained by first poking a hole in $\mathbb{R}^2$ at each lattice point and
then inserting an infinitesimal rivet at that point; an advantage that this space has over
$\mathbb{R}^2 \setminus \mathbb{Z}^2$ is that (as we will see later; see Figure 9.2 in the next section) every homotopy class
has a minimal length representative.

We will now describe the homotopy group of $W$ more explicitly. Let $a = (a_1, a_2)$ be an
arbitrary point in $\mathbb{R}^2 \setminus \mathbb{Z}^2$ with irrational coordinates. For each $x$ in $\mathbb{Z}^2$, let $r_x$ be the closed
line segment from $a$ to the point $x$ (including its limit point $x \times \arg(a - x)$). Let $u_x$ be the
portion of the same ray (from $a$ through $x$ to infinity) which lies between $x$ and $\infty$, together
with its limit point $x \times \arg(x - a)$.

We can describe the homotopy classes of paths in $\mathbb{R}^2 \setminus \mathbb{Z}^2$ which start and end at $a$ in the
following way. Let $x$ be the homotopy class of a path which follows $r_x$ from $a$ towards $x$, then
makes a counterclockwise loop around $x$, and then returns to $a$ along $r_x$. Every homotopy
class can be uniquely represented by a reduced word in elements of the form $x$, for $x \in \mathbb{Z}^2$
(i.e., a finite-length word in the elements $x$ and $x^{-1}$ in which no element $x$ appears next to
its inverse $x^{-1}$).

Now, let $p : [0, 1] \to \mathbb{R}^2$ be any cycle in $\mathbb{R}^2 \setminus \mathbb{Z}^2$ which starts and ends at $a$ and which
has only finitely many intersections with the rays $u_x$, $x \in \mathbb{Z}^2$. Given $p$, we can generate the
word corresponding to its homotopy class as follows. Let $t$ vary between 0 and 1. Each time
$p(t)$ crosses a ray $u_x$ in a counterclockwise direction, add $x$ to the end of the word; each
time it crosses a ray $u_x$ in a clockwise direction, add $x^{-1}$. The reader may easily verify (e.g.,
using induction on word length) that the fundamental group element produced in this way
describes the homotopy class of $p$.
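
The ray-crossing recipe above can be sketched for polygonal cycles as follows. This is a minimal illustration, not the author's construction: it assumes a finite list of punctures, a closed polygon through the base point $a$, and a path in general position (no polygon vertex on a ray). The names `crossing_word` and `reduce_word` are hypothetical; a letter is stored as a pair (puncture, +1 or -1).

```python
def cross(u, v):
    """Two-dimensional cross product u x v."""
    return u[0] * v[1] - u[1] * v[0]


def reduce_word(word):
    """Cancel adjacent inverse pairs (free-group reduction)."""
    reduced = []
    for letter in word:
        if reduced and reduced[-1][0] == letter[0] and reduced[-1][1] == -letter[1]:
            reduced.pop()
        else:
            reduced.append(letter)
    return reduced


def crossing_word(base, punctures, vertices):
    """Reduced word of signed crossings of the rays u_x by the polygon base -> vertices -> base."""
    points = [base] + list(vertices) + [base]
    word = []
    for p, q in zip(points, points[1:]):
        step = (q[0] - p[0], q[1] - p[1])
        hits = []
        for x in punctures:
            ray = (x[0] - base[0], x[1] - base[1])   # direction of u_x, pointing away from base
            denom = cross(step, ray)
            if denom == 0:
                continue                             # segment is parallel to the ray
            offset = (x[0] - p[0], x[1] - p[1])
            s = cross(offset, ray) / denom           # position along the segment
            t = cross(offset, step) / denom          # position along the ray, measured from x
            if 0 <= s < 1 and t > 0:
                sign = 1 if cross(ray, step) > 0 else -1   # +1 for a counterclockwise crossing
                hits.append((s, x, sign))
        word.extend((x, sign) for _, x, sign in sorted(hits))
    return reduce_word(word)


# Example data (hypothetical): two punctures and a base point below both of them.
base = (0.4142, -0.7321)
punctures = [(0, 0), (1, 0)]
loop0 = [(0.5, 0.5), (-0.5, 0.5), (-0.5, -0.5)]   # counterclockwise around (0, 0)
loop1 = [(1.5, -0.5), (1.5, 0.5), (0.5, 0.5)]     # counterclockwise around (1, 0)
```

Concatenating `loop0`, `loop1` and their reversals (each separated by `base`) produces a commutator, whose reduced word has four letters; this reflects the fact that the group is free rather than abelian.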

Through the remainder of our discussion, if $x \in \mathbb{Z}^2$, we will treat $x$ as an element of $W$
by using $x$ as a shorthand for $x \times 0$. For any $x, y \in \mathbb{Z}^2$, let $P_{x,y}$ be the set of continuous paths
from $x$ to $y$ in $W$. Let $r'_x$ denote the linear segment from $a$ to $x \times \arg(a - x)$, followed
by a counter-clockwise arc from $x \times \arg(a - x)$ to $x \times 0$. Note that the map $p \mapsto (r'_x)^{-1} p\, r'_y$
(using the standard concatenation definition of path multiplication; see, e.g., Chapter 2 of
[69]) induces a one-to-one correspondence between homotopy classes of paths $p$ from $a$ to
itself and homotopy classes of paths from $x$ to $y$. (The inverse of this map is given by
$p \mapsto r'_x\, p\, (r'_y)^{-1}$.)

We can view each $p \in P_{x,y}$ as a function from $[0, 1]$ to $W$, defined up to a monotonic,
continuous reparametrization of $[0, 1]$; by slight abuse of notation, we will sometimes also use
$p$ to denote the subset of $W$ contained in the image of $[0, 1]$ under this function. If $p \in P_{x,y}$
for some $x$ and $y$, let $P_p$ be the set of paths in $P_{x,y}$ which are homotopically equivalent to $p$
in $W$. Let $P$ be the union of $P_{x,y}$ over all pairs of distinct points $x, y \in \mathbb{Z}^2$.

9.4.3 Minimal length paths 

For which sets of paths $p_1, \ldots, p_k$ and points $x \in \mathbb{Z}^2$ is it the case that $x$ lies in a bounded
component of $\mathbb{R}^2 \setminus \bigcup_{i=1}^k q_i$ for every $(q_1, \ldots, q_k) \in \prod_{i=1}^k P_{p_i}$? Roughly speaking, the answer is
that this is the case whenever the "taut" or "minimal length" paths

$$(\tilde p_1, \tilde p_2, \ldots, \tilde p_k) \in \prod_{i=1}^k P_{p_i}$$

in these homotopy classes disconnect $x$ from infinity in a certain "strong" sense. The purpose
of this subsection is to make this statement precise. Some of the results in this section are
related to algorithms in the computer science literature for finding minimal length paths of
given homotopy classes in regions with polygonal obstacles (see, e.g., [23] for examples and
additional references) and for testing equivalency between homotopy classes by computing
the unique minimal length representatives of those classes. However, we have been unable
to find exactly what we need in the literature, so we will give our own proofs of some of the
basic facts (such as the existence of paths of minimal length) in our context.

A line segment in $W$ is an open line segment in $\mathbb{R}^2 \setminus \mathbb{Z}^2$ together with its limit points
(which may be in $\mathbb{Z}^2 \times S^1$). An arc in $W$ is a closed path in $x \times S^1$ (for some $x \in \mathbb{Z}^2$)
which moves either strictly clockwise or strictly counterclockwise around $S^1$. A piecewise
linear path $p$ in $W$ is a path formed by concatenating finitely many line segments and arcs
(where no two arcs appear sequentially in the concatenation); we will also assume that $p$ is
parametrized in such a way that it is not constant on any interval of $[0, 1]$. It also suits our
purposes to assume that the endpoints of $p$ are points of $\mathbb{Z}^2$.

When $p$ is piecewise linear, we write $p'_+$ for the right derivative and $p'_-$ for the left
derivative of $p$ (so that $p'_+(t) = p'_-(t)$ whenever $p$ is differentiable at $t$). If, in some subinterval
of $[0, 1]$, we have $p(t) = (x, \theta(t))$, with $x \in \mathbb{Z}^2$, then we write $p'(t) = \frac{\partial}{\partial t}\theta(t)\,(-\sin\theta, \cos\theta)$;
we think of $p'(t)$ as a vector pointing "around the infinitesimal circle" in the direction that $p$ is
moving; we define $p'_+(t)$ and $p'_-(t)$ on such subintervals accordingly. If $p(0) = (x_0, \theta_0)$, then we
write $p'_-(0) = (-\cos\theta_0, -\sin\theta_0)$, and if $p(1) = (x_1, \theta_1)$, we write $p'_+(1) = (-\cos\theta_1, -\sin\theta_1)$.
(Informally, we think of $p$ as "emerging from inside the hole at $x_0$" when $t = 0$ and "turning
inwards into the hole at $x_1$" when $t = 1$.) Now, whenever $p$ is piecewise linear, $p'_+$ and $p'_-$
are defined throughout the interval $[0,1]$. Unless otherwise stated, we will always assume
that a piecewise linear $p$ is parametrized according to Euclidean length/arc length (so that
$|p'_-|$ and $|p'_+|$ are both constant). Also, we will assume that $p$ has no U-turns (i.e., points $t$
at which $p'_-(t) = -p'_+(t)$).

If $p$ is piecewise linear, a free corner of $p$ is a point $t \in [0, 1]$ for which $p(t) \in \mathbb{R}^2 \setminus \mathbb{Z}^2$ and
at which the path $p$ changes directions; we refer to $p(t)$ as the position of the corner. We
also refer to each connected component of $p^{-1}[\mathbb{Z}^2 \times S^1]$ as a loop corner of $p$; the length of
a loop corner is the length of the corresponding arc; the position is the corresponding point
$x \in \mathbb{Z}^2$. We say $p$ is taut if it contains no free corners and the length of every loop corner is
at least $\pi$. The length of a piecewise linear path in $W$ is the sum of the Euclidean lengths of
its line segments. We now prove the following:

Lemma 9.4.1 In every homotopy class $P_p \subset P_{x,y}$ there exists exactly one taut path $\tilde p$. This
$\tilde p$ has minimal length among all paths in $P_p$.
Figure 9.2: A path $p$ (left) and the taut version $\tilde p$ (right). The infinitesimal "rivets" of
$W$ are shown as small circles on the left, slightly larger circles on the right. Rays $r(x)$ and
$u(x)$ are depicted on the left.

Proof Although this lemma may seem obvious intuitively, it will take us a fair amount of
space to prove it. Suppose that $q_1$ and $q_2$ are two distinct homotopically equivalent taut
paths in $P_p \subset P_{x,y}$. Choose $a$ to be a generic point with irrational coordinates that lies
vertically below the entire paths $q_1$ and $q_2$, i.e., the second coordinate of $a$ is less than the
smallest value $x_2$ which occurs in a point $(x_1, x_2)$ in either of these paths; in particular, this
allows us to assume that whenever $x \in \mathbb{Z}^2$ and $x$ occurs in the word corresponding to $P_p$, $x$
is higher than (i.e., has higher second coordinate than) the point $a$ in the plane $\mathbb{R}^2$.

Now, for q_1, we can form a word as follows. Let X be the finite set of points x for which
one of the paths q_i either crosses u_x at some point or contains a loop corner at position x.
Order the points x_1, x_2, ..., x_m in order of the arguments of (x − a). Let t vary from 0
to 1. Each time q_1(t) crosses a ray u_x, with x ∈ X, in a counterclockwise direction, add the
symbol for x to the end of the word; each time it crosses such a ray u_x in a clockwise direction,
add its inverse. Similarly, each time it crosses r_x, with x ∈ X, counterclockwise, add the symbol
for x to the end of the word; each time it crosses such an r_x clockwise, add its inverse to the
end of the word. (The symbols recording crossings of the two kinds of ray are distinguished by
a "hat" and a "bar.") Denote this word by w_1 and the analogously defined word for q_2 by w_2.
(Since a is generic, and each of q_1 and q_2 contains only finitely many arcs and line segments,
it is not hard to see that the set of t at which one of the q_i(t) crosses a given u_x or r_x is
finite, and that at each such t, the path crosses the ray transversely, either clockwise or
counterclockwise, so that the above construction is well defined.)
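
As an illustration of the bookkeeping behind such crossing words, here is a minimal Python sketch of free reduction (cancelling a symbol against an adjacent inverse). The encoding of a symbol as a triple (index, ray, sign) and the function names are assumptions made for this example only; they are not notation from the text.

```python
from typing import List, Tuple

# Hypothetical encoding: (index i of the lattice point x_i,
#                         which ray was crossed: "u" or "r",
#                         +1 for a counterclockwise crossing, -1 for clockwise).
Symbol = Tuple[int, str, int]


def reduce_word(word: List[Symbol]) -> List[Symbol]:
    """Cancel adjacent pairs s, s^{-1} until none remain (free reduction)."""
    stack: List[Symbol] = []
    for idx, ray, sign in word:
        # A symbol cancels only against its own inverse: same index, same ray,
        # opposite crossing direction.
        if stack and stack[-1] == (idx, ray, -sign):
            stack.pop()
        else:
            stack.append((idx, ray, sign))
    return stack


def is_reduced(word: List[Symbol]) -> bool:
    """True if no symbol is immediately followed by its inverse."""
    return reduce_word(word) == list(word)


if __name__ == "__main__":
    w = [(1, "u", 1), (2, "u", 1), (2, "u", -1), (1, "r", 1)]
    print(reduce_word(w))   # the middle pair cancels
    print(is_reduced(w))
```

In this encoding, the argument in the proof amounts to showing that the word of a taut path is already reduced, so two taut paths in the same class must produce the same word.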

The rays r_{x_i} ∪ u_{x_i} separate W into wedge-shaped open pieces; for 1 ≤ i ≤ m − 1, denote
by W_i the piece between r_{x_i} ∪ u_{x_i} and r_{x_{i+1}} ∪ u_{x_{i+1}}. Denote by W_0 = W_m the complement of
the closures of the W_i for 1 ≤ i ≤ m − 1. Note that we write down a symbol \hat{x}_i or \bar{x}_i each
time q_i passes from W_i to W_{i+1} and \hat{x}_i^{-1} or \bar{x}_i^{-1} each time q_i passes from W_{i+1} to W_i. Now,
we observe that if q_1 is taut, no two symbols of the form \hat{x}_i and \hat{x}_i^{-1} (or, similarly, \bar{x}_i and
\bar{x}_i^{-1}) can occur in sequence. If they did, then there would be some t_1 and t_2 with q_1((t_1, t_2))
contained in W_i while q_1(t_1) and q_1(t_2) both belong to the same member of {r_{x_i}, u_{x_i}}. And
it is easy to see that at some point in this interval (t_1, t_2) (e.g., the first point at which the
argument of q_1 achieves its maximum) q_1 must have either a loop corner at x_{i+1} (with length
less than π) or a free corner. A similar argument implies that the opposite sequences \hat{x}_i^{-1} \hat{x}_i
and \bar{x}_i^{-1} \bar{x}_i cannot occur in either w_1 or w_2.

Since symbols occur when q_i passes from one wedge to another, it follows that whenever
a positive symbol (\hat{x}_i or \bar{x}_i) and an inverse of a symbol (of the form \hat{x}_i^{-1} or \bar{x}_i^{-1}) occur next to
each other in the word, then they have the same index i and one is a "hat" and one is a
"bar" symbol. Also, it is easy to see that if two positive symbols occur successively in w_1,
and the index of the first is i, then the index of the second is i + 1. If two negative symbols
occur successively and the index of the first is i, then the index of the second is i − 1.

We know that the elements of the form \bar{x}_i and \bar{x}_i^{-1} that appear in the word are determined
by the homotopy class of p. From the above paragraph, it is clear that \bar{x}_i and its inverse \bar{x}_i^{-1}
cannot appear in the word with only "hat" elements separating them; otherwise, a hat symbol
would have to occur next to its inverse. It follows that, if the "hat" symbols are omitted,
then the remaining "bar" symbols give a reduced form expression of the fundamental group
word of p. Also, in between a pair of "bar" symbols, the hat symbol sequence is completely
determined by the rules of the above paragraph. Thus, w_1 = w_2.

Now, let f_1(i) be the point where q_1 first intersects the ray that corresponds to the ith
element of the word. Since q_1 is taut, each f_1(i) is either a point x in Z^2 (at which a loop
corner of arc length at least π occurs) or a point in the interior of r_x or u_x. Define f_2
accordingly. It is not hard to see that each pair f_1(i) and f_1(i + 1) must be connected by
a straight line segment and/or arc of q_1 (since the pair is connected by a piecewise linear
path which intersects no ray transversely and has no corners); it follows that if f_1 = f_2,
then q_1 and q_2 must be equal. In fact, if f_1(i) ∈ Z^2, then its argument is determined by the
word ordering; thus, f_1(i) is determined by the value g_1(i) = |f_1(i) − a|. So it is enough for
us to prove that g_1(i) = g_2(i) for all i. Suppose otherwise, and let i be a value for which
g_1(i) − g_2(i) is maximal. Suppose that f_1(i) and f_2(i) lie along the ray through x_j. If neither
lies at the point x_j, then a corner cannot occur at either one of them, and a simple geometric
argument shows that at least one of the values g_1(i − 1) − g_2(i − 1) and g_1(i + 1) − g_2(i + 1)
is greater than g_1(i) − g_2(i). On the other hand, suppose that only f_1(i) lies at the point x_j
and f_2(i) does not. Assume for simplicity that f_1(i) and f_2(i) lie on u_x (the case when they
lie on r_x is similar). Then we see, first, since f_2(i) is part of a straight line segment of q_2
from the ray through x_{j−1} to the ray through x_{j+1}, the values f_2(i + 1) and f_2(i − 1) will lie
on this pair of rays, as will f_1(i + 1) and f_1(i − 1) (since the words are equivalent). It follows
that the corner that occurs in q_1 at f_1(i) has angle between π and 2π (if it were greater
than 2π, then the path would have to intersect r_x before proceeding to one of the other rays).
Again, a simple geometric argument implies at least one of the values g_1(i − 1) − g_2(i − 1)
and g_1(i + 1) − g_2(i + 1) is greater than g_1(i) − g_2(i). We have now proved that the class
P_p ⊂ P_{x,y} contains at most one taut path p.

It remains to prove that P_{x,y} contains at least one taut path p and that this path has
minimal length. Given a word w, the above arguments determine the order in which any taut
path in the corresponding homotopy class from x to y would have to intersect r_{x_i} and u_{x_i}
(and the direction, clockwise or counterclockwise, for each such value). Let P_w be the set
of paths from x to y which indeed intersect the rays in the given order; as seen above, such
paths are completely determined by the function g. Since length is a continuous function
of the values g(i), it is clear that P_w contains an element p for which the length is minimal. It is not
hard to see that if p failed to be taut, we could decrease its length by moving one of the g(i).

How do we see that p has minimal length among all paths? Arguments similar to those
above imply that when looking for minimal length paths, we may restrict our attention to
paths p' which induce the same word w as p (as described above). (If this is not the case,
then a portion of the path will exit and enter one of the wedges W_i along the same ray, and
thus the path can be shortened by pulling that portion taut.) It is also not hard to see that
we may assume p' is piecewise linear (since otherwise, by "straightening" segments and arcs
of p', we could produce a piecewise linear path in P_p with length less than or equal to that of
p'). The arguments above then imply that p has minimal length among paths of this form;
hence it has minimal length over all paths in P_p. ∎

If r is a path, each of whose endpoints is an endpoint of either q or p, then we say that p
and q have a crossing of type r if there is a path r' ∈ P_r for which the image of r' in W lies
in the union of the images of p and q. Now define a metric on the set of paths in a particular
homotopy class: δ(p, q) = inf (sup_{t ∈ [0,1]} |p'(t) − q'(t)|), where |x| denotes the Euclidean norm
of x and the infimum runs over all parametrizations p' and q' of the paths p and q. We say
two paths p_1 and p_2 are equivalent if δ(p_1, p_2) = 0; so δ is actually a metric on equivalence
classes, not paths. Denote by B_γ(p) the ball of radius γ about p in this metric. We say
that p and q have an essential crossing of type r if for some sufficiently small γ, p' and q'
have a crossing of type r whenever (p', q') ∈ (P_p ∩ B_γ(p)) × (P_q ∩ B_γ(q)). See Figure 9.3. The
usefulness of these concepts stems from the following lemma.
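
A distance of this inf-sup form over reparametrizations is commonly called the Fréchet distance (assuming the parametrizations are taken to be orientation-preserving, which is an assumption of this note). The following Python sketch computes the discrete Fréchet distance between the vertex sequences of two polygonal paths. It is an upper bound on the continuous quantity, not the metric itself, and the function name is chosen here for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def discrete_frechet(p: List[Point], q: List[Point]) -> float:
    """Discrete Frechet distance between two vertex sequences.

    d[i][j] is the smallest achievable value of the largest pairwise distance
    when both sequences are traversed monotonically from their starts up to
    p[i] and q[j]. Only vertices are matched, so the result can exceed the
    continuous Frechet distance of the polygonal paths.
    """
    n, m = len(p), len(q)
    d = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            cost = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                prev = 0.0
            elif i == 0:
                prev = d[0][j - 1]
            elif j == 0:
                prev = d[i - 1][0]
            else:
                prev = min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
            d[i][j] = max(prev, cost)
    return d[n - 1][m - 1]


if __name__ == "__main__":
    lower = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    upper = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
    print(discrete_frechet(lower, upper))  # two parallel segments one unit apart
```

Two paths that trace the same segment with different vertex sets have continuous distance zero but may have positive discrete distance, which is why the sketch is only a bound.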

Lemma 9.4.2 If p_1 and p_2 are piecewise linear paths and their taut versions have an essential
crossing of type r, then p_1 and p_2 also have an essential crossing of type r.

Proof From the definition of essential crossings, it is clear that the set C_r of path pairs
(q_1, q_2) ∈ P_{p_1} × P_{p_2} with no essential crossings of type r is closed with respect to the product
topology generated by δ on P_{p_1} × P_{p_2}. It is also not hard to see that the set of elements
in P_{p_1} × P_{p_2} with combined length less than L, for some L ∈ R, is compact; in particular,
the length function is lower semi-continuous. It follows that the length function achieves its
minimum over C_r on some pair: we may assume this pair is (p_1, p_2).

Now, we claim that p_1 and p_2 are both taut. Suppose otherwise, and that without loss of
generality p_1 is not taut. Then there either exists an s ∈ (0, 1) for which p_1(s) = x ∈ R^2 \ Z^2
and p_1 fails to be linear on a neighborhood of s, or there exists an s ∈ [0, 1] for which
p_1(s) = x ∈ Z^2 × S^1 and the angle of the arc at p_1 is less than 2π.

Figure 9.3: The taut versions of the upper two paths have points of intersection but do not
have an essential crossing; the taut versions of the lower two paths have an essential crossing.

Now, let D be a small closed disc centered at x; we may assume that D contains no
elements of Z^2 (except x if x ∈ Z^2) and that its radius is generic, so that p_1 and p_2 each
intersect the boundary of D at only finitely many points.

Let q_1 and q_2 be obtained from p_1 and p_2 by "pulling taut" the portions of p_1 and p_2
inside D, i.e., replacing them with the minimal length piecewise linear paths with the same
endpoints on D and in the same homotopy classes. Clearly, q_1 and q_2 are shorter than p_1
and p_2. We will be done if we can show that q_1 and q_2 (or some q'_1 and q'_2 whose length can
be made arbitrarily close to the length of q_1 and q_2) have no essential crossing of type r.

Now, by assumption, there exist p'_1 and p'_2 arbitrarily close to (p_1, p_2) which have no
crossing of type r. Given such p'_1 and p'_2, we claim that we can "almost pull taut" the
portions of p'_1 and p'_2 inside of D in such a way that no crossing of type r occurs.
First let us deal with the case that x is not a point in Z^2; in this case, we simply pull p'_1 and
p'_2 taut in D to produce q'_1 and q'_2. Now observe that a pair of segments in q'_1 and q'_2 will
intersect one another if and only if the positions at which the segments exit D are ordered in such
a way that the endpoints of one segment divide the circle into two pieces, one containing each
of the endpoints of the other segment (by adjusting p'_1 and p'_2 slightly if necessary, we may
assume that their points of intersection with the boundary ∂D occur at distinct locations).
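
The interleaving criterion for two chords of a disc is easy to state in code. The sketch below is an illustration only: it represents each endpoint on the circle by its angle in radians, assumes all four endpoints are distinct, and the function name is hypothetical.

```python
import math


def chords_cross(a1: float, a2: float, b1: float, b2: float) -> bool:
    """Return True if chord (a1, a2) and chord (b1, b2) of a disc intersect.

    Endpoints are given as angles on the boundary circle and are assumed to
    be pairwise distinct. The chords cross exactly when one endpoint of the
    second chord lies on the counterclockwise arc from a1 to a2 and the
    other endpoint lies on the complementary arc.
    """
    two_pi = 2.0 * math.pi

    def on_ccw_arc(t: float, start: float, end: float) -> bool:
        span = (end - start) % two_pi   # length of the ccw arc start -> end
        offset = (t - start) % two_pi   # ccw distance from start to t
        return 0.0 < offset < span

    return on_ccw_arc(b1, a1, a2) != on_ccw_arc(b2, a1, a2)


if __name__ == "__main__":
    # Endpoints interleave around the circle: the chords cross.
    print(chords_cross(0.0, math.pi, math.pi / 2, 3 * math.pi / 2))
    # Both endpoints of the second chord lie on the same side: no crossing.
    print(chords_cross(0.0, math.pi / 2, math.pi, 3 * math.pi / 2))
```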

The famous Jordan curve theorem states that any continuous simple closed curve in the 
plane separates the plane into two disjoint regions, the inside and the outside. A simple 
corollary is that if distinct points a, b, c, d are in cyclic order around a disc, and p is a 
continuous path in the disc from a to c, and q is a continuous path from b to d, then p and q 
must intersect. It is not hard to see that if q'_1 and q'_2 have a crossing of type r, p'_1 and p'_2 will
have a crossing of the same type. It follows by assumption that q'_1 and q'_2 have no crossing
of type r. Finally, it is not hard to see (by choosing p'_1 and p'_2 close enough to p_1 and p_2)
that we can arrange for q'_1 and q'_2 to be arbitrarily close, in the δ metric, to q_1 and q_2; hence,
q_1 and q_2 have no essential crossing of type r.

Figure 9.4: Five paths pulled "almost taut." Each pair of paths with a point of intersection
on the right side must have an analogous point of intersection on the left side.

The above argument shows that if p_1 and p_2 are minimal length paths with no essential r
crossing, then p_1 and p_2 must be taut in a neighborhood of any point x ∈ R^2 \ Z^2. If x ∈ Z^2,
and either of p_1 or p_2 fails to be taut at x (i.e., has an arc whose length is less than π)
then we can apply a similar argument. In this case, to pull p'_1 and p'_2 "taut" we replace
the segments of these paths passing through D with the minimal length paths in the same
homotopy class; a path of this type will either be a straight line segment connecting two
points on D or a straight line segment from the boundary of D followed by an arc of length
at least π and another straight line segment out to the outer boundary of D.

Let A be the set of all the arcs that occur in the paths of this form produced from the
segments of p'_1 and p'_2. We can partially order these arcs via inclusion; it is well-known that
every partial ordering has an extension to a total ordering, and hence, we can replace each
infinitesimal arc with an arc with the same angle and small positive radius, and choose the
radii in such a way that they are decreasing with respect to the total ordering. See Figure
9.4. Adjusting p'_1 and p'_2 slightly if necessary, we may assume that no two of these arcs share
an endpoint.

It is clear that the paths defined in this way can be made to have arbitrarily small radius.
Two such paths will cross one another if and only if the corresponding arcs a_1 and a_2 have the
property that a_1 is neither a subset of a_2 nor a subset of its complement. Another corollary
of the Jordan Curve Theorem is that if θ_1 < θ_2 < θ_3 < θ_4 are angles and D has radius R,
and p is a path in D \ {x} from (R, θ_1) to (R, θ_3) (where the angles, which are real numbers, may be
viewed as determining points in the universal cover of the annulus, so that the amount of
winding of p is given by θ_3 − θ_1) and q is an analogously defined path from (R, θ_2) to (R, θ_4),
then p and q must cross (and in fact, must have a crossing of the same type). As before,
it is not hard to see that we can arrange for q'_1 and q'_2 to be arbitrarily close to q_1 and q_2;
hence q_1 and q_2 have no essential crossing of type r. ∎

We say that a collection of piecewise linear paths p_1, ..., p_k separates x from y if there
exists no path from x to y which does not cross at least one of the p_i. We say that p_1, ..., p_k
essentially separate x from y if there exists no piecewise linear path from x to y which does
not have an essential crossing with at least one of the p_i.

We say that p_1, ..., p_k (essentially) bound x away from infinity if for all but finitely many
y, p_1, ..., p_k (essentially) separate x from y.

Lemma 9.4.3 If the taut versions of p_1, p_2, ..., p_k essentially separate x from infinity, then
p_1, ..., p_k essentially separate x from infinity. In particular, p_1, ..., p_k separate x from infinity.

9.4.4 Completing proof of Theorem 9.3.1

Now, we return to the terminology of Section 9.4.1. We have already defined an extremely
small constant, ε. We will also need δ (a very small constant) and γ (a quite small constant)
and β (a small constant). The exact values are unimportant. The following will comfortably
suffice for our purposes:

ε = 10^{-1000} μ(A_∞)^{10000}
δ = 10^{-100} μ(A_∞)^{1000}
γ = 10^{-10} μ(A_∞)^{100}
β = 10^{-1} μ(A_∞)^{10}

Recall that our aim is to prove that μ(B(0)) > ε, thereby deriving a contradiction.
We call w a δ-preferred direction if the coordinates of w are relatively prime to one another
and μ(C_{p_v}(v, v + w)) > δ for all v ∈ Z^2, where p_v is a straight path connecting v to v + w.

Lemma 9.4.4 If there exist two distinct δ-preferred directions w_1 and w_2 (with w_1 ≠ −w_2),
then μ(B(0)) > ε.

Proof The FKG inequality implies that with probability at least δ^8, the events C_{p_i}(a_i, a_{i+1})
occur for each 1 ≤ i ≤ 8, where the values a_i are given by v, v + w, w, −v + w, −v, −v −
w, −w, v − w, v (so a_9 = a_1), successively, and each p_i is the straight path from a_i to a_{i+1}.
In this case, there are eight paths (call them q_1, ..., q_8, contained in T\A) with each q_i
connecting A(a_i) to A(a_{i+1}) in a way homotopically equivalent to p_i.

We claim that if each of these events occurs and A(a_i) occurs for each i, then B(0) must
also occur. To see this, first, say a vertex b ∈ A(v) ∩ T is exposed if there exists a path in T\A
connecting b to some b' ∈ A(w), for some w ≠ v. Then observe that if a given A(v) occurs,
then any two exposed points in A(v) can be connected by a "short" path in A(v), i.e., a
path which contains only points that are closer to A(v) than to any other A(w). If v ≠ 0,
then such a path is homotopically retractable in R^2 \ {(0, 0)} to a straight line (since v ≠ 0).
By concatenating these short paths with the q_i's, we produce a cycle, built from pieces in the
A(a_i) and in T\A, which is homotopically equivalent in R^2 \ {(0, 0)} to the concatenation of
the p_i's (which surrounds 0). It follows that B(0) must occur.

Given the events C Pi (a,i, eij+i) — which imply the event — the A(a,j) fail to occur with 
μ-probability at most 8ε. Thus, μ(B(0)) > δ^8 − 8ε > ε. ∎
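
The last inequality can be checked directly. The following worked check assumes the constants have the form ε = 10^{-1000} a^{10000} and δ = 10^{-100} a^{1000} for some probability a with 0 < a ≤ 1; the symbol a is introduced here only for illustration.

```latex
\begin{align*}
\delta^{8} &= 10^{-800}\, a^{8000} \\
  &\ge 10^{-800}\, a^{10000} && \text{since } 0 < a \le 1 \\
  &= 10^{200}\,\varepsilon \\
  &> 9\,\varepsilon ,
\end{align*}
so that $\delta^{8} - 8\varepsilon > \varepsilon$.
```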

Lemma 9.4.5 There exists at least one γ-preferred direction.

Proof We will assume that there is no γ-preferred direction, and attempt to derive a contradiction.
Now, let u(v) be the angle of the direction of the first line segment in the taut path
with the same homotopy class as the path Pp assumes between v and v + (this is well-defined
given the event A(v)). Given two distinct angles (θ_1, θ_2), let C_{(θ_1, θ_2)}(v) be the event that
C_p(v, w) occurs for some path p which, when pulled taut, leaves v in a direction which lies
strictly in the interval (θ_1, θ_2) (on the counterclockwise side of θ_1).

Now suppose that with probability at least β, u(0) lies in (0, π/2) and with probability
at least β it lies in (π/2, π); by shift-invariance of the law of Pp, this implies the
same is true of u(v) for any v. This also implies that for each v, the increasing events
C_{(0,π/2)}(v) and C_{(π/2,π)}(v) occur with probability at least β. Hence, by the FKG inequality,
both occur simultaneously with probability at least β^2. Now, we claim that C_{(0,π/2)}((0, 0)) and
C_{(π/2,π)}((1, 0)) together imply C_{p_0}((0, 0), (1, 0)), where p_0 is the straight path from (0, 0)
to (1, 0). To see this, first make
the geometric observation that any taut path from (0, 0) with starting direction in (0, π/2)
and a taut path from (1, 0) with starting direction in (π/2, π) must intersect; in fact, the
first linear segments of these two paths must cross transversely at some point in the set
{(x_1, x_2) : 0 < x_1 < 1}, forming an essential crossing of type p_0. Lemma 9.4.2 then implies
that C_{p_0}((0, 0), (1, 0)) follows from C_{(0,π/2)}((0, 0)) and C_{(π/2,π)}((1, 0)). Thus, in this case, (1, 0)
is a β^2-preferred direction and hence, since β^2 > γ, a γ-preferred direction. Using similar
arguments for (0, 1), we conclude that no two
consecutive quadrants can each contain u(0) with probability β.

The same argument applies if we replace (1, 0) and (0, 1) with any pair of generators for
the lattice Z^2 (i.e., any pair of vectors w_1 and w_2 whose integer span is Z^2, or equivalently,
any w_1 and w_2, each of which contains a relatively prime pair of coordinates, for which the
parallelogram with sides determined by w_1 and w_2 has unit area). In this case, we cannot
have u(0) contained in two consecutive quadrants of the form (arg(±v), arg(±w)), each with
probability β. This also implies that at least one pair of opposite quadrants has combined
probability less than 2β.

Now, suppose that u(0) is equal to π/2 with probability β. When this occurs, the taut
version p of the section of Pp between (0, 0) and (0, 0) + will start in the (0, 1) direction and
pass the points (0, 1), (0, 2), (0, 3), ... (on either the left or right, with arcs of length π) up
to some vertex at which it either turns to the left or right or ends. Assume without loss of
generality that with probability at least β/2, p does not pass the first vertex on the right
with angle π. Conditioned on this event, let i denote the first (0, i) in the sequence that
p does not pass on the left (with angle π or greater; see Figure 9.5). This i is a random
variable; let i_0 be its median value (conditioned on u(0) and on p not passing the first vertex
on the right with angle π). Now, let C^+(v) be the event that there is some path p in T\A
between A(v) and some A(w), which, when pulled taut, starts out by moving in the (0, 1)
direction (as before, not passing the first vertex on the right at angle π) and passes at least i_0
vertices on the left before it stops or turns; let C^-(v) be defined analogously except that the
taut path passes at most i_0 vertices on the left before it stops or turns. Each of these events
has probability at least β/4. Now, we claim that C^+((0, 0)) and C^-((0, 1)) together imply
C_{p_0}((0, 0), (0, 1)), where p_0 is the straight path from (0, 0) to (0, 1). To see this, as before, by
Lemma 9.4.2 it is enough to let p_1 and p_2 be the paths whose existence is guaranteed by the
events C^+((0, 0)) and C^-((0, 1)) and to show that p_1 and p_2 must have an essential crossing
of type p_0. (We leave these details to the reader.) It follows that (0, 1) is a (β/4)^2-preferred
direction, and (β/4)^2 > γ. We conclude that u(0) cannot point in the (0, 1) direction with
probability β; a similar argument implies that u(0) cannot be equal to any single vector v
with probability β.

Figure 9.5: If the bottom vertex is the origin, then the first path shown passes two vertices
(namely, (0, 1) and (0, 2) on the left). The other two paths each pass one vertex on the left.
Now, with probability μ(A_∞), u(0) has some direction. Thus, for any pair of generators
(v, w), since each of ±v and ±w has probability less than β, and one pair of opposite
quadrants has combined probability less than 2β, the other pair of opposite quadrants must
have combined probability equal to at least μ(A_∞) − 6β. Assume without loss of generality
that u(0) lies in the intervals (arg v, arg w) and (arg(−v), arg(−w)) with probability at least
μ(A_∞) − 6β. Equivalently, u'(0) ∈ (arg v, arg w), where u'(0) is u(0) modulo π.

Now, we can replace v, w with either the set of generators (v, v + w) or (v + w, w). Since u
is equal to ±(v + w) with probability at most β, we have that u'(0) will lie in one of the two
intervals (arg v, arg(v + w)) and (arg(v + w), arg w) with probability at least (μ(A_∞) − 7β)/2.
Assume without loss of generality that this is the first interval. Then for the set of generators
(v, v + w), the quadrant between v and v + w and its opposite must be the highly probable
ones; that is, we must have u'(0) ∈ (arg v, arg(v + w)) with probability at least μ(A_∞) − 6β.

We can repeat this process, each time sending a generating pair (v, w) to either (v, v + w)
or (v + w, w). Now, assume (changing bases if necessary to make this the case) that the
initial pair of vectors was v = (1, 0) and w = (0, 1). Then with each modification, the
coordinates of v and w remain nonnegative but the sum of the coordinates increases. It follows
that the parallelogram defined by v and w grows progressively longer
(i.e., v + w increases in norm) and skinnier (since the parallelograms always have area one)
with each iteration, and at each step, the probability that u(0) belongs to the narrow angle
defined by the parallelogram is at least μ(A_∞) − 6β. However, this cannot be true for a nested
sequence of arbitrarily small angles, because u(v) has no sufficiently large point masses (i.e.,
it achieves no single value with probability more than β), so this is a contradiction. ∎
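
The refinement of generating pairs used in this proof can be traced numerically. The Python sketch below is an illustration under stated assumptions: it starts from v = (1, 0), w = (0, 1), and at each step keeps whichever of (v, v + w) or (v + w, w) still brackets a fixed target angle theta, which stands in for "the highly probable interval" and is not a quantity from the text. The function names are hypothetical.

```python
import math
from typing import List, Tuple

Vec = Tuple[int, int]


def det(v: Vec, w: Vec) -> int:
    """Signed area of the parallelogram spanned by v and w."""
    return v[0] * w[1] - v[1] * w[0]


def angle(v: Vec) -> float:
    """Argument of v, in radians."""
    return math.atan2(v[1], v[0])


def refine(theta: float, steps: int) -> List[Tuple[Vec, Vec]]:
    """Repeatedly replace (v, w) by (v, v + w) or (v + w, w).

    The pair kept at each step is the one whose angular interval
    [arg v, arg w] contains theta (assumed to lie in [0, pi/2]).
    """
    v, w = (1, 0), (0, 1)
    history = [(v, w)]
    for _ in range(steps):
        s = (v[0] + w[0], v[1] + w[1])
        if theta < angle(s):
            w = s   # keep the interval (arg v, arg(v + w))
        else:
            v = s   # keep the interval (arg(v + w), arg w)
        history.append((v, w))
    return history


if __name__ == "__main__":
    for v, w in refine(1.0, 6):
        # The area stays 1 while the angular width shrinks.
        print(v, w, det(v, w), angle(w) - angle(v))
```

Every pair produced this way has determinant 1, so it still generates Z^2, while the angular width of the interval strictly decreases from step to step.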

To complete the proof, suppose that there exists one γ-preferred direction v and no other
direction which is δ-preferred. Assume for now that v = (1, 0). Now, replace A with A'
obtained by removing every second row from A; that is, write A'((x, y)) = n(x, 2y) + A_k and
A' = ∪_{v ∈ Z^2} A'(v). Now, observe that, if we redefine preferred directions as before but using
A' instead of A, then although (0, 1) may no longer be a γ-preferred direction, it is still a
(γ^2 − ε)-preferred direction (in particular, it is a δ-preferred direction). To see this, observe
that if there are paths homotopic to the straight ones in T\A from A((0, 0)) to A((1, 0)) and
from A((1, 0)) to A((2, 0)), and A((1, 0)) occurs, then there must be a path from A((0, 0))
to A((2, 0)) which is homotopic to the straight one. It is not hard to see that Lemma 9.4.4
and Lemma 9.4.5 still apply to the modified system produced by replacing A with A'.

Now, we repeat this process of switching A with A'; if at some point (1, 0) ceased to be
a γ-preferred direction then, as we have observed, it would still be a δ-preferred direction.
By Lemma 9.4.5 there would have to be another direction which was γ-preferred, and by
Lemma 9.4.4 this would imply μ(B(0)) > ε.

On the other hand, it is not hard to see that if we repeat this process for m steps (so
that A' contains every 2^m-th column of A), then as m gets large, the probability that + lies on
the vertical coordinate axis will tend to μ(A_∞). By the argument used in the proof of Lemma
9.4.5, this implies that for m large enough, (1, 0) will be a δ-preferred direction, and again,
Lemma 9.4.4 will imply that μ(B(0)) > ε. If v is not equal to (1, 0), we can change the basis
for the integer lattice indexing the boxes of A so that it is equal to (1, 0) and apply the same
argument as the one above to show that μ(B(0)) > ε.

This concludes our proof by contradiction of Theorem 9.3.1.

9.4.5 A corollary 

The following easy corollary of Theorem 9.3.1 may be of independent interest:

Corollary 9.4.6 There exists no L-invariant Gibbs measure μ on the space of subsets of Z^2
which possesses the following properties:

1. μ satisfies the FKG inequality.

2. With μ probability one, T and T^c each have a single infinite connected component.

Proof Let T' be the union of T and all of the finite components of the complement of T.
If T is a random variable whose law satisfies the conditions of Corollary 9.4.6, then T' is a
random variable whose law satisfies the conditions of Theorem 9.3.1. ∎

9.5 Ergodic gradient Gibbs measures of slope u ∈ ∂U_Φ

We have now described the L-ergodic Gibbs measures with slopes in U_Φ. The ergodic Gibbs
measures with slopes in ∂U_Φ are much easier to describe. Let X be a single closed edge of
the polygon ∂U_Φ.

We say x and y lie in the same frozen band of Z 2 if for every p 6 £/$ with slope in X, 
the value φ(y) − φ(x) is almost surely constant. By Lemma 4.3.6, there exists an infinite
sequence of parallel "frozen bands" (call them b_i) which are subsets of Z^2. We say a function
φ is X-direction taut if the height differences of φ on every such b_i are precisely these values.

Let a_i be the first vertex on the horizontal coordinate axis of Z^2 which intersects b_i.
(If a_i is not defined for every i, because the bands b_i are actually parallel to the horizontal
axis, then replace the horizontal coordinate axis with the vertical axis in this definition.) Let
A_i be the set of possible differences φ(a_i) − φ(a_{i−1}) for functions φ which have the defined
differences on the frozen bands. (Note that A_i may be all of Z.) Let k_X be the smallest
integer for which there exists a v in L for which θ_v b_0 = b_{k_X}.

Let T be the set of functions f : Z → Z for which f(i) ∈ A_i for all i ∈ Z. It is now easy
to verify the following:

Theorem 9.5.1 The set of extremal Gibbs measures μ on Z^2 for which φ is μ-almost surely
X-direction taut is in one-to-one correspondence with elements of T. The set of L-ergodic
Gibbs measures μ on Z^2 with slope in X is in one-to-one correspondence with measures on
T with finite expectations, and which are ergodic under translations by k_X.

Chapter 10
Open Problems



10.1 Universality when m = 1 and d = 2 

Several of the most intriguing questions about random surfaces arise in the case that m = 1
and d = 2 and Φ is a simply attractive potential.

10.1.1 Infinite differentiability of σ away from roughening transition slopes

In the case of periodically weighted domino tilings, the surface tension σ is infinitely differentiable
away from the slopes in the dual of L jM]. We conjecture that this is the case for
general discrete simply attractive potentials when m = 1 and d = 2. We further conjecture
that the smooth phases with slopes in U_Φ occur at precisely those slopes at which σ has a
cusp (as in the dimer model case [64]).

10.1.2 Central limit theorems: convergence to Gaussian free field 

Kenyon recently proved that with certain kinds of boundary conditions, random domino 
tiling height functions have a scaling limit (when the lattice spacing tends to zero) which is 
the "massless free field," a conformally invariant Gaussian process whose coefficients in the 
eigenbasis of the Dirichlet Laplacian are independent Gaussians [U3|. Naddaf and Spencer
proved a similar result for Ginzburg-Landau ∇φ-interface models [73]. We conjecture that
a similar result holds for general simply attractive models in a rough phase.

10.1.3 Level set scaling limits 

If a height function defined on Z^2 is interpolated to a function which is continuous and
piecewise linear on simplices, then the level sets C_a, given by φ^{-1}(a), for a ∈ R, are unions
of disjoint cycles. What do the typical "large" cycles look like when Φ is simply attractive
and φ is sampled from a rough gradient phase? The answer is given in [H2| in the simplest
case of quadratic nearest neighbor potentials: in this case, the "scaling limit" of the loops,
as the mesh size gets finer, is well defined, and the limiting loops look locally like a variant
of the Schramm-Loewner evolution with parameter κ = 4. We conjecture that this limit
is universal, i.e., that the level sets have the same limiting law for all simply attractive
potentials in a rough phase.

10.1.4 Strong uniqueness 

We proved that when m = 1 and d = 2, E = Z, and Φ is Lipschitz, simply attractive, and
L-ergodic, then the L-ergodic gradient Gibbs measure of slope u ∈ U_Φ is unique. However,
we did not address the existence of non-L-ergodic gradient Gibbs measures of slope u.

To be precise, we say a non-L-invariant gradient Gibbs measure μ has approximate
slope u if for any ε, the probability that |[φ(x) − φ(0)] − (u, x)| > εn for some x ∈ Λ_n
tends to zero as n tends to ∞.

Conjecture 10.1.1 For each u ∈ U_Φ, there exists only one gradient Gibbs measure with
approximate slope u.

Such a result would be analogous to the two-dimensional Ising model result which says 
that there exist no non-translation-invariant Gibbs measures (see jHj for a new proof of this 
fact and a history of the problem); it is possible that techniques similar to those of jH] will 
be useful in this context as well. 

10.2 Universality when m = 1 and d ≥ 3

10.2.1 Central limit theorems: convergence (after rescaling) to 
Gaussian free field 

Using different scalings (which probably only make sense when the spin space is continuous) , 
Naddaf and Spencer extended their central limit theorems to higher dimensions in the
Ginzburg-Landau setting. In what situations do similar results hold for perturbed simply 
attractive potentials? 

10.2.2 Existence of rough phases 

We conjecture that there are no rough phases for simply attractive potentials when d ≥ 3,
m = 1, and E = R; this is known to be the case for Ginzburg-Landau models (see, e.g., JIU]
for more references) but it is not known in general. We also conjecture (although this may
be riskier) that there are no rough phases for any simply attractive potentials when d ≥ 3
and E = Z.

10.3 Refinements of results proved here



10.3.1 Fully anisotropic potentials 

The large deviations principle and variational principle results in Chapters El and [7| were 
proved for isotropic potentials and for Lipschitz potentials in the discrete setting. We sus- 
pect that the large deviations principle and variational principle theorems have analogs for 
perturbed anisotropic (i.e., not necessarily isotropic) simply attractive potentials. In partic- 
ular: 

Conjecture 10.3.1 The variational principle (Theorem W. 3.1)) applies to all perturbed sim- 
ply attractive potentials. 

For the purposes of deriving the strongest possible anisotropic general large deviations 
principles, it may be useful to use the anisotropic analogs of the Orlicz-Sobolev space results 
we used in Chapter 03 — several of these analogous results are proved in [T7]. It would also be 
nice to have a proof of the variational principle that does not rely as heavily on analysis as 
the proof we presented here for the perturbed isotropic case. 

10.3.2 Measures of infinite specific free energy 

In our version of the variational principle, we showed that for every ergodic $\mathcal{L}$-invariant
gradient Gibbs measure $\mu$ with slope $u \in U_\Phi$, $SFE(\mu) = \sigma(u)$ whenever $SFE(\mu)$ is finite.
Say an $\mathcal{L}$-ergodic gradient Gibbs measure $\mu$ on $(\Omega, \mathcal{F}^\tau)$ is non-trivial if $\mu$-almost surely,
$Z_\Lambda(\phi) < \infty$ for all $\Lambda \subset\subset \mathbb{Z}^d$.

Conjecture 10.3.2 For every $u \in U_\Phi$, for every perturbed simply attractive potential, there
exists no non-trivial (in the sense described above) gradient Gibbs measure $\mu$ for which
$S(\mu) = u$ and $SFE(\mu) = \infty$.

10.3.3 More general domains D 

There is a range of weaker Orlicz-Sobolev theorems that apply when weaker regularity con- 
ditions are placed on D (see, e.g., Remark 3.12 of [IE]); indeed, much of the literature 
on Orlicz-Sobolev spaces (see, e.g., the reference text jTTJ ) is focused on extending Orlicz- 
Sobolev bounds and embedding theorems to domains with strange boundaries. It is probably 
possible to use these more general results to prove weaker large deviations principles for ran- 
dom surfaces on appropriate mesh approximations of these more general domains. 

10.4 Substantially different potentials 
10.4.1 Strongly repulsive lattice particles 

We proved large deviations principles for classes of perturbed simply attractive potentials $\Phi$.
For what other kinds of nearest-neighbor potentials $\Phi$ do similar large deviations principles
apply? Consider the case $m = d = 3$; in this case, we might think of $\phi$ as describing the
spatial position of atoms in a three-dimensional solid lattice. To take into account repulsive
forces between atoms, let $\Phi = \Phi^1 + \Phi^2$ where $\Phi^1$ is simply attractive and $\Phi^2$ is a "strongly
repulsive" potential given by $\Phi^2_{x,y}(\phi) = V(\eta)$ for every pair $x, y \in \mathbb{Z}^d$, where $V : \mathbb{R} \to \mathbb{R}$ is
symmetric and satisfies $\lim_{|\eta| \to 0} V(\eta) = \infty$ and $\lim_{|\eta| \to \infty} |\eta|^d V(\eta) = 0$. In this case, as before,
we will define $\sigma(u)$ to be the minimal specific free energy among ergodic Gibbs measures $\mu$
of a given slope $u$ (here $u$ is a $3 \times 3$ matrix). It is not hard to see that $\sigma$ is symmetric and
a(u) = oo if and only if u = and that cr(w) tends to infinity as the determinant of u tends 
to 0; this latter fact is a frequently imposed condition in the study of continuous variational 
problems and partial differential equations. (See, for example, Section 9.2 of [77].) 
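
For concreteness, one family of functions satisfying both limit conditions imposed on $V$ above is the inverse power law. This is an illustrative assumption only; the text does not single out any particular $V$. With $V(\eta) = |\eta|^{-p}$ for $\eta \neq 0$ and any exponent $p > d$, both conditions can be checked directly:

```latex
\[
  V(\eta) = |\eta|^{-p}, \quad p > d:
  \qquad
  \lim_{|\eta| \to 0} |\eta|^{-p} = \infty,
  \qquad
  \lim_{|\eta| \to \infty} |\eta|^{d} \, |\eta|^{-p}
    = \lim_{|\eta| \to \infty} |\eta|^{d - p} = 0 .
\]
```

The second limit fails when $p \leq d$, so the decay condition is what excludes slowly decaying repulsion.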

Clearly, the surface tension will not be convex for models of this form. But are the 
gradient phases unique? If not, is it possible to classify them or to prove a large deviations 
principle similar to the one proved in this text? Under some conditions, we might expect 
the gradient phases to be random perturbations of periodic lattice packings, so this problem 
may be related to sphere packing problems. 

10.4.2 Large deviations for more general tiling problems 

Domino tilings are in one-to-one correspondence with the finite-energy height functions
$\phi : \mathbb{Z}^2 \to \mathbb{Z}$ for an appropriate potential $\Phi$. Many other classes of tilings (e.g., ribbon tilings,
tilings by $1 \times a$ and $b \times 1$ blocks, etc.) are in one-to-one correspondence with the finite-energy
height functions $\phi : \mathbb{Z}^2 \to \mathbb{Z}^m$, where $m > 1$ and $\Phi$ is an appropriately chosen convex
nearest-neighbor gradient potential. (See [80] or [20].) However, none of the results in this
text (variational principle, large deviations principle, Gibbs measure classification, etc.) is 
known for any non-trivial tiling problem in which the height function space has dimension 
m > 1. 

10.4.3 Non-nearest-neighbor interactions 

As observed in Section 18.81 the gradient phase uniqueness arguments of Chapter El fail to 
hold if we consider convex pair potentials which are not nearest neighbor potentials. What, 
in fact, can be said about the gradient phases in the non-nearest neighbor case? When d = 2 
and E = Z, is there a convex, finite range gradient potential for which the smooth gradient 
phases have arbitrarily many slopes (not merely slopes in the dual of $\mathcal{L}$)? Is there a convex,
infinite range potential for which there is a smooth gradient phase of every rational slope? 

10.4.4 General submodular potentials 

It is natural to wonder whether the cluster-swapping arguments used in this text really only 
apply for pair potentials. 

We say a potential $\Phi$ is submodular if for every $\Lambda \subset\subset \mathbb{Z}^d$, $\Phi_\Lambda$ has the property that

\[ \Phi_\Lambda(\phi_1) + \Phi_\Lambda(\phi_2) \geq \Phi_\Lambda(\min(\phi_1, \phi_2)) + \Phi_\Lambda(\max(\phi_1, \phi_2)). \]

When $\Phi$ is a gradient pair potential, the property of submodularity is equivalent to the
convexity of the potential functions $V_{x,y}$. Is there some variant of the cluster swapping
argument that applies to general finite-range, submodular potentials?
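
The link between convexity and submodularity for pair potentials can be checked by brute force on a small example. The sketch below is an illustration only: the three-site path, the height range $\{0, 1, 2\}$, and the two test functions are assumptions chosen for this example, and the names `pair_energy` and `submodular_gap` are hypothetical helpers, not notation from the text.

```python
import itertools

def pair_energy(V, phi, edges):
    """Sum of V(phi[y] - phi[x]) over the listed edges (x, y)."""
    return sum(V(phi[y] - phi[x]) for x, y in edges)

def submodular_gap(V, phi1, phi2, edges):
    """Left side minus right side of the submodularity inequality.

    The inequality holds for the pair (phi1, phi2) exactly when this is >= 0.
    """
    lo = [min(a, b) for a, b in zip(phi1, phi2)]  # pointwise minimum
    hi = [max(a, b) for a, b in zip(phi1, phi2)]  # pointwise maximum
    return (pair_energy(V, phi1, edges) + pair_energy(V, phi2, edges)
            - pair_energy(V, lo, edges) - pair_energy(V, hi, edges))

# Assumed toy setting: a path on three sites, integer heights in {0, 1, 2}.
edges = [(0, 1), (1, 2)]
configs = list(itertools.product(range(3), repeat=3))

def convex_V(eta):
    return eta * eta      # convex difference potential

def nonconvex_V(eta):
    return -abs(eta)      # concave, so not convex

convex_ok = all(submodular_gap(convex_V, p, q, edges) >= 0
                for p in configs for q in configs)
nonconvex_ok = all(submodular_gap(nonconvex_V, p, q, edges) >= 0
                   for p in configs for q in configs)
```

In this toy setting the convex potential satisfies the inequality for every pair of configurations, while the concave one is violated, for instance by the pair `(0, 1, 1)` and `(1, 0, 0)`.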
Appendix A

SFE and the lexicographic past 



We present here an alternative proof of Theorem 3.3.4, which also leads to a representation
of the specific free energy in terms of the entropy of $\mu$ on one period of $\mathcal{L}$, conditioned on
the lexicographic past. This approach is analogous to the approach used in Chapter 15 of
[43] for non-gradient measures. We restate the theorem here for convenience.

Theorem A.0.1 The function a : Q h-> R, defined by a(<f>) = SFE(irf), is 7 n 3 L - 
measurable, is bounded below, and satisfies 

SFE(fi) = fi(a) = fi(SFE(irl)) 

for all fi £ ^(fi,^- 

Corollary A.0.2 If $\mu$ can be written

\[ \mu = \int \nu \, w_\mu(d\nu) \]

then

\[ SFE(\mu) = \int SFE(\nu) \, w_\mu(d\nu) = w_\mu(SFE). \]

We would like to prove this by citing an analogous result proved for non-gradient measures.
First, we need some notation. Whenever $\mu \in \mathcal{P}(\Omega, \mathcal{F})$ and $\Lambda \subset \mathbb{Z}^d$, write $\mu\lambda_\Lambda \in \mathcal{P}(\Omega, \mathcal{F})$
to mean the independent product of $\mu_{\mathbb{Z}^d \setminus \Lambda}$ (to determine $\phi(x)$ when $x \notin \Lambda$) and $\lambda^{|\Lambda|}$
(to determine $\phi(x)$ for $x \in \Lambda$). (This is a finite measure if $\lambda$ is a finite measure.) Sometimes
we will replace $\lambda_\Lambda$ with $f\lambda_\Lambda$ where $f$ is an $\mathcal{F}$-measurable function, but the definition is the
same. Next, given $x, y \in \mathbb{Z}^d$, write $x \prec y$ if $x$ precedes $y$ in the lexicographic ordering of $\mathbb{Z}^d$
(i.e., $x_j < y_j$ where $j = \min\{i \mid x_i \neq y_i\}$). For each $y \in \mathbb{Z}^d$, write $\Gamma(y) = \{y\} \cup \{x \in \mathbb{Z}^d \mid x \prec y\}$
and $\Gamma^*(y) = \{x \in \mathbb{Z}^d \mid x \prec y\}$. Now, Proposition 15.16 of [33] states the following. (Since we
intend to apply this result to an "alternate" measure space $(\bar\Omega, \bar{\mathcal{F}})$, and not the space $(\Omega, \mathcal{F})$ we
have been using throughout this text, we will use bars over these and other variables in the
theorem statement to avoid confusion with our analogously defined "global" variables.)
Lemma A.0.3 Let $\bar\mu$ be a shift-invariant measure on $(\bar\Omega, \bar{\mathcal{F}})$ where $\bar\Omega = \bar{E}^{\mathbb{Z}^d}$, $(\bar{E}, \bar{\mathcal{E}})$ is a compact
standard Borel space with a finite underlying measure $\bar\lambda$, and $\bar{\mathcal{F}}, \bar{\mathcal{I}}, \bar{\mathcal{T}}$ are the corresponding
Borel product, shift-invariant, and tail a -algebras. Define h(~p) = lim^oo \ A n \" 1 'K(j[,J[X\\ n \) . 
Then h is well-defined and 7 fl d -measurable. Moreover, for each y G Z d , 

h{p) = D< Fny) (jl,-p\ { y } ). 

In the setting of Lemma A.0.3, $h(\bar\mu)$ is called the specific entropy of $\bar\mu$. In less formal
terms, the lemma states that $h(\bar\mu)$ is equal to the $\bar\mu$-expected conditional entropy of the
random variable $\phi(y)$ (with respect to $\bar\lambda$) given the values $\phi(x)$ for $\{x \mid x \prec y\}$. The next
lemma, Proposition 15.20 of [13], is analogous to Theorem 3.3.4.

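Lemma A.0.4 If, in the setting of Lemma A.0.3, $\bar\mu$ has a representation

\[ \bar\mu = \int \bar\nu \, w_{\bar\mu}(d\bar\nu) \]

where $\bar{\mathcal{F}}$ is the Borel $\sigma$-algebra of the product topology on $\bar\Omega$, then

\[ h(\bar\mu) = \bar\mu(h(\bar\pi^{\cdot})) = \int h(\bar\nu) \, w_{\bar\mu}(d\bar\nu) = w_{\bar\mu}(h). \]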
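
The informal reading of Lemma A.0.3, entropy per site as a conditional entropy given the lexicographic past, has an elementary finite analog: the chain rule for Shannon entropy. The sketch below checks it on a $2 \times 2$ block of binary variables. It is only an illustration under simplifying assumptions: a finite block instead of $\mathbb{Z}^d$, counting measure as the underlying measure, Shannon entropy with the usual sign convention (which may differ from the sign convention of the relative entropy used in the text), and a randomly generated joint law. All function names are hypothetical.

```python
import itertools
import math
import random

def entropy(dist):
    """Shannon entropy (natural log) of a dict mapping outcomes to probabilities."""
    return -sum(q * math.log(q) for q in dist.values() if q > 0)

def marginal(joint, sites, keep):
    """Marginal law of the coordinates listed in `keep`."""
    idx = [sites.index(s) for s in keep]
    out = {}
    for config, prob in joint.items():
        key = tuple(config[i] for i in idx)
        out[key] = out.get(key, 0.0) + prob
    return out

def conditional_entropy(joint, sites, y, past):
    """H(X_y | X_past), computed as H(X_{past + y}) - H(X_past)."""
    return (entropy(marginal(joint, sites, past + [y]))
            - entropy(marginal(joint, sites, past)))

def chain_rule_terms(joint, sites):
    """Conditional entropy of each site given its lexicographic past in the block."""
    ordered = sorted(sites)  # Python compares tuples lexicographically
    return [conditional_entropy(joint, sites, y, ordered[:k])
            for k, y in enumerate(ordered)]

# Assumed toy data: an arbitrary joint law on {0, 1}^4, indexed by a 2 x 2 block.
sites = [(i, j) for i in range(2) for j in range(2)]
rng = random.Random(0)
weights = {c: rng.random() for c in itertools.product((0, 1), repeat=len(sites))}
total = sum(weights.values())
joint = {c: w / total for c, w in weights.items()}

terms = chain_rule_terms(joint, sites)
```

The sum of `terms` agrees with `entropy(joint)` up to floating-point error. For a shift-invariant law on the whole lattice, Lemma A.0.3 states the limiting counterpart: the per-site entropy equals a single conditional-entropy term, conditioned on the full lexicographic past.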

We will now derive Theorem 3.3.4 from these lemmas. Our first step will be to use $\mu$ to
construct a new measure space and a related measure $\bar\mu$ to which the above lemmas apply.
Let $\Delta$ be a fundamental domain of $\mathcal{L}$. Though our argument applies to any sublattice $\mathcal{L}$ of
$\mathbb{Z}^d$, we will assume, for notational simplicity, that $\mathcal{L} = k\mathbb{Z}^d$ for some integer $k \geq 1$, so that
$\Delta = [0, k - 1]^d$. Let $e_1, \ldots, e_d$ be the standard basis vectors for $\mathbb{Z}^d$. Define $\bar{E} = E^{|\Delta|}$. Given
$\phi \in \Omega$, we define $\bar\phi \in \bar\Omega$ by writing

\[ \bar\phi(x) = (\phi(kx + a_1) - \phi(kx + a_0), \ldots, \phi(kx + a_m) - \phi(kx + a_0)) \]

where $\{a_i\}$ is an enumeration of the points in $\Delta \cup \{-e_d\}$, with $a_0 = -e_d$. Note that if
$k = 1$, then $\bar\phi$ is simply a discrete derivative of $\phi$ in the $e_d$ direction. We define a measure

$\bar\lambda = \exp(-H^o_{\Delta \cup \{-e_d\}}) \, \lambda^{|\Delta|}$. If $\Phi$ is a perturbed simply attractive model, then $\bar\lambda$ is easily seen
to be a finite measure; we can add $\infty$ to $\bar{E}$ to make it a compact standard Borel space (by
the definition of [13]) with finite underlying measure. For later use, for $1 \leq j \leq d$, define
$\bar\lambda^j$ analogously to $\bar\lambda$ by replacing $e_d$ with $e_j$. Finally, let $\bar\mu$ be the law on $(\bar\Omega, \bar{\mathcal{F}})$ induced
by $\mu$ and the map $\phi \to \bar\phi$. Now $\bar\mu$ and $(\bar\Omega, \bar{\mathcal{F}})$ satisfy the conditions in Lemma A.0.3 and
Lemma A.0.4. Throughout this section, if $\Lambda_1 \subset \mathbb{Z}^d$ and $\Lambda_2 \subset \mathbb{Z}^d$, we use the notation
$\Lambda_1 + \Lambda_2 = \{x + y \mid x \in \Lambda_1, y \in \Lambda_2\}$. We define a modified potential as follows:




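\[ \Phi'_\Lambda = \begin{cases} 0 & \Lambda \subset (\Delta \cup \{-e_d\}) + x \text{ for some } x \in \mathcal{L} \\ \Phi_\Lambda & \text{otherwise} \end{cases} \]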
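
As a concrete illustration of the block map $\phi \to \bar\phi$ used in this construction, the sketch below computes $\bar\phi(x)$ in the simplest setting. It assumes $d = 1$ and $\mathcal{L} = k\mathbb{Z}$, so that the fundamental domain is $\{0, \ldots, k - 1\}$ and the reference site $a_0 = -e_d$ is the integer $-1$. The function name `block_differences` and the sample height function are hypothetical choices made for this example.

```python
def block_differences(phi, k, x):
    """phi-bar(x) for d = 1: the heights on the block {k*x, ..., k*x + k - 1},
    each measured relative to the height at the site k*x - 1 just before the block."""
    base = phi(k * x - 1)
    return tuple(phi(k * x + a) - base for a in range(k))

def phi(n):
    """An arbitrary integer-valued height function, chosen only for illustration."""
    return n * n

def shifted(n):
    """The same surface raised by a constant."""
    return phi(n) + 7

derivative_at_3 = block_differences(phi, 1, 3)   # k = 1: phi(3) - phi(2)
block_at_1 = block_differences(phi, 2, 1)        # k = 2: sites 2 and 3 relative to site 1
```

With $k = 1$ the output is the discrete derivative, as noted in the text. The map depends only on height differences, so `phi` and `shifted` give the same values.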
We now claim that

\[ SFE(\mu) = h(\bar\mu) + \mu(\Phi') \]

where $\mu(\Phi')$ is the specific energy of $\mu$ with respect to $\Phi'$, defined by

\[ \mu(\Phi') = \lim_{n \to \infty} |\Lambda_n|^{-1} \mu(H^{\Phi'}_{\Lambda_n}). \]

Note that since both $\mu$ and $\Phi'$ are $\mathcal{L}$-invariant and $\Phi'$ has finite range, it is also possible
to write $\mu(\Phi')$ as the $\mu$ expectation of a single positive cylinder function (see Chapter 15 of
[43]). We claim that if we can show that $SFE(\mu) = h(\bar\mu) + \mu(\Phi')$, then this fact, along with
Lemmas A.0.3 and A.0.4, will imply Theorem 3.3.4. To see this, observe that by Lemma
3.2.5 and the definition of ergodic decompositions at the beginning of this chapter, it follows
that $\mu(F) = \mu(\pi_{\mathcal{L}}^{\cdot}(F)) = \int \nu(F) \, w_\mu(d\nu)$ whenever $F$ is the indicator function of a cylinder
event; since a positive cylinder function can be written as a positive linear sum of countably
many indicator functions of cylinder events, it easily follows that

\[ \mu(\Phi') = \mu(\pi_{\mathcal{L}}^{\cdot}(\Phi')) = \int \nu(\Phi') \, w_\mu(d\nu). \]

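It also follows trivially from definitions that $\mu = \int \nu \, w_\mu(d\nu)$ implies $\bar\mu = \int \bar\nu \, w_{\bar\mu}(d\bar\nu)$. Thus,

\[ SFE(\mu) = h(\bar\mu) + \mu(\Phi') = \int h(\bar\nu) \, w_{\bar\mu}(d\bar\nu) + \int \nu(\Phi') \, w_\mu(d\nu) = \int SFE(\nu) \, w_\mu(d\nu) = w_\mu(SFE). \]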

By Lemma rA.0.3| h(V) is 7 H J^-measurable as a function of v G ^(fi, 3F); in particular, 
it is ^-measurable, and the same is trivially true of v($>'). Thus by Lemma 13.2.51 and the 
definition of ergodic decompositions, we also have the rest of the theorem statement: 

SFE(im) = h{SFE{*1)). 

It now remains only to prove that SFE(fi) = h(jl) + //($'). We do this by checking two 

equalities, which we state separately as lemmas. First, define \i\ 3 y G as follows: 

sampling from that measure is equivalent to first sampling from [i and then re-sampling 
4>(x) for x G A + kj in such a way that the values {4>{kj + a,) — (f)(kj — e^-) 1 1 < % < |A|} obey 

the law of X 3 (where X 3 is as defined above). 

Lemma A.0.5 For each y G Z d , if SFE{fi) < oo, then 

SFE^) = X? lr{y)+A ^,vX)+^')- 
Moreover, if SFE(fi) = oo, then we still have 

5FS( A i) = M SSrw+A (A*,/*^)+/*($0 
for some 1 < j < d (in which case we may relabel the coordinates axes so that j = d). 
Proof Let $" = $— Write = H\+H\ where the latter two terms are the components
of coming from $' and $" respectively. If A C Z d , write A = kA + A. It follows from 
the definition of free energy that: 

lAnl" 1 ^^) = IKl^Xsr (//,ex P (-HI ) Al^l" 1 ) = 

A n \ / 

lA^l^^^exp (-^l) A 1 ^ 1 - 1 ) + |A n |-V(#i)- 

As n tends to infinity, the left hand side tends to SFE(fi) and the second term on the right 
hand side tends to So it is enough to check that: 

lim IKr 1 ^ (//,exp (-H* ) X'^ 1 ) = Jfe U,^)- 

We now prove this fact using an argument similar to the one in the proof of lATOl in Pj . 
Write atj = ^g^, (//, ^tA^). By Lemma 1^.1.31 is the expected conditional entropy of </>(?/) 
(with respect to A^) gwen the differences <p(xi) — <f)(x2) for all x±,X2 G r(y). We will now 
express IK^t. (//, exp (— iJi ) A' A,l ' _1 ) as a sum of IAJ expected conditional entropies, all of 

A n \ Jin/ 

which (except for a boundary set) are between zero and atd, and most of which are very close 
to a = ad- 

Now, define atA„(y) = -Kgr _ (/i, /xA ) where j = j y = sup{i\y - e; 6 A n }. (Sep- 

A ?l nr(i/) y 

arately define aA„(0) = "K^ , (/i, /iA' A ' -1 ).) In words, we can think of each a\ n (y) as 
an expected conditional entropy of an n d -dimensional random variable ip(y) (given by the 
4>(ky + i) — 4>(ky — e 3 - y )) with respect to X y given the values of ip(x) for x -< y and x G A n . 
(Note that i[>(Q) is only an n d - 1 dimensional variable.) Let H-^ be the sum of the en- 
ergies that give the Radon-Nikodym derivatives for these measures \° y : that is, H-^ n ) = 
H\ + ^2 yeAn y+ Q H[ y+ A]u{j y }- Now, repeated applications of Lemma 1^.1.31 to this sequence of 
expected conditional entropies yields 

yeA n 

Now, H-r and H\ differ only by the inclusion and exclusions of energy terms coming from 
the boundary of A n . Unless one of these expected energy terms is infinite (in which case it 
is clear that SFE(fi) = oo and a- 7 = oo for some j), £j -invariance implies that 

Hsjfoexp (-i^J A^l- 1 ) = a ^(y) + o(n d ). 

yeA„ 

If a? = oo for some j, then we will assume that the coordinate axes were labelled in such 
a way that j = d, in which case the above expression still holds. (We will satisfy the lemma 
statement in this case by showing that SFE(fi) = oo.) By Lemma ^.l.ll we have a.A n {y) < ot 
whenever both y G A and y — G A (i.e., except for boundary terms). By Lemma \2. 1.21 for

any e, there exists a finite subset A C T*(0) for which J£grr A (//, //A y ) > a — e. Then Lemma 
12. 1.1 1 implies that whenever a\ n (y) > a — e whenever y + A G A n . Letting n tend to infinity, 
we conclude that 



a: 



a-e< lim lAJ-^sr (//, exp ) A |A " hl ) < 

Since e was arbitrary, the expression is equal to a. I 
Lemma A. 0.6 h(ji) = ^ijz (/•*, 

Proof By our construction and Lemma lA.0.31 h(jl) = IK/^//, //A y ), where A y is the o- 
algebra on Q formed by pulling back 3 r r(y) via the map t— > 0; in other words, A y is the 
smallest a-algebra in which <p(xi) — <fi{x2) is measurable for every pair (a^a^) contained in 
A U {— ed} U A + kj for some y -< 0. Define f$(xi, X2) = 4>(%i) — 0(^2) for all such pairs. So 
now, we need only show that CKa^//, //A y ) = (/•*, The two er-algebras are slightly 

different; in the former, functions of <fi{xi) — <p{x2) are measurable only when X\,X2 G r(y) 
and Xi and X2 are in the same "row" — i.e., x\ = y± + zi and X2 = y2 + £2 where each Zj is in 
A and the yi G A;[r(?/)] are vectors that agree on all but the <ith coordinate. In particular, 
for each y = (yi,...,yd) £ To, define an "average row distance" function that is measurable 
with respect to but not A y : 

j 

9<p(y) = lim j' 1 Y] <t>(yi, 1/2, • • • , yd-i, i) ~ 0(°, 0, . . . , 0, i). 

i=i 

Since we may assume that // is shift invariant with finite slope (otherwise it is clear that 
SFE(n) = 00 and //($') = 00, so we must have SFE(fi) = h(jl) + //($') in that case), it 
follows from the ergodic theorem that is well-defined //-almost surely. Now, the lemma 
is equivalent to the statement that the expected conditional entropy of 0(0) given is the 
same as the expected conditional entropy given and g$. It suffices to show that given fy, 
the regular conditional probability distributions for the random variables 0(0) and g^ are 
almost surely independent. 

Suppose otherwise. Then there would be, with positive probability, some event $A$ depending
only on $g_\phi$ and some $\epsilon > 0$ such that $|\mu(A \mid f_\phi, \bar\phi(0)) - \mu(A \mid f_\phi)| > \epsilon$ (where here
$\mu(A \mid *)$ denotes the regular conditional distribution, given $*$, integrated over $A$). By the
(one-dimensional) ergodic theorem, this statement would also have to be true with positive
probability for a positive fraction of the shifted functions $\theta_{j e_d} \phi$ with $j \in \mathbb{Z}$; but since the
probability of $A$ with respect to the increasing sequence (in $j$) of subalgebras $\mathcal{A}_{y + j e_d}$ is a
martingale, this contradicts the martingale convergence theorem. ∎
Appendix B
Summary of notations 



Vectors and lattices 

$d$                   dimension of the configuration lattice $\mathbb{Z}^d$

$E$                   target space for random functions (usually $\mathbb{R}^m$ or $\mathbb{Z}^m$)

$m$                   dimension of $E$

$e_1, \ldots, e_d$    standard basis vectors in domain space $\mathbb{Z}^d$

e 1 , . . . , e m standard basis vectors in range space E 

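$\mathcal{E}$         $\sigma$-algebra on $E$

$\lambda$             underlying measure (usually counting or Lebesgue) on $(E, \mathcal{E})$

$\Lambda, \Delta$     subsets of $\mathbb{Z}^d$

$\Lambda \subset\subset \mathbb{Z}^d$   $\Lambda \subset \mathbb{Z}^d$ and $|\Lambda| < \infty$

$\mathcal{L}$         rank-$d$ sublattice of $\mathbb{Z}^d$

$\Lambda_n$           $[0, kn - 1]^d \subset \mathbb{Z}^d$ where $k$ is chosen so that $k\mathbb{Z}^d \subset \mathcal{L}$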

group of translations of Z d by elements of L 

r group of translations of E 

£j dual lattice of £ 

Configuration space and cr-algebras 

f2 configuration space, set of functions from Z d to E 

0, ip elements of ft, 

9 X (4>) translation of G f2 by x G Z d , i.e., (9 x <f>)(y) = <f>(x + y) 

smallest cr-algebra on f2 in which values on A are £-measurable 

7 a-algebra generated by 3a, A CC Z d 

7a 7 gd\A 

7 tail a-algebra, defined as n Ac cz d X\ 

5F r a-algebra of r-invariant elements of 7 

7\ 7 A n IF 

T A T A n J T 

T r Tn? T 
Gibbs Potentials and Hamiltonians



<3>, * Gibbs potentials; $ = {$ A : A CC Z d }, where each <3> A : Q — > RU {00} 

is measurable 
Hf((f)), H\((p) Hamiltonian in A, defined as X^AnA^0 ^a(0) 
H° K (4>) interior Hamiltonian in A, defined as ^aca^a(0) 

Zf((f)), ^a(0) partition function on A with boundary condition on Z d \A, i.e., 

/ UxeA d< i>( x ) ex P (-#a0)) 
7*, 7a transition kernel corresponding to a Gibbs rerandomization on A, i.e., 

7 f(A, 0) = Za^)- 1 / Il^A eX P (-^A(0)) 

free boundary partition function on A with respect to $, i.e., integral 
of e~ H A over entire space E^^ 1 of functions (defined up to additive 
constant) on A 
W(A) -\ogZ° A 

T>n space of positive L x r-invariant potentials for which W(e) is finite for 

every edge e 



Nearest-neighbor Gibbs Potentials 

V, V Xt y nearest-neighbor difference potential 

V wedge- normalization of V, defined as V{rj) — log g(F(i])) where g(rj) = 

2 -4|„-I| and F( V ) = fee-vM* 
i] nearest neighbor height difference (input to V) 

SAP simply attractive potential (a.k.a., convex nearest-neighbor, periodic 

difference potential) 
ISAP isotropic simply attractive potential 

LSAP Lipschitz simply attractive potential 

$y ISAP in which V Xiy = V for all adjacent pairs x, y G Z d 



Spaces of probability measures on configuration space 

set of probability measures on (Q, 5F) 

set of Gibbs measures on (f2, 3 r ), i.e., measures \i such that for all A CC 
Z d , < Z\{<p) < 00 /i-a.s. and /17A = /x 
set of probability measures on (f2, 5F T ) 
set of XL-invariant probability measures on (fl, 5F T ) 
set of gradient Gibbs measures on (Q,^), i.e., measures \i such that 
for all A CC Z d , < Z^(<p) < 00 /i-a.s. and /17A = \i 
set of probability measures on (f2, 5F T ) 
set of ^-invariant gradient Gibbs measures on (Q, 3 rT ) 
set of £-ergodic probability measures on (f2, 5F r ) 
set of extremal gradient Gibbs measures on (Q, 5F r ) 
set of £-ergodic gradient Gibbs measures on (Q, 3 rT ) 
measures, usually elements of 3 5 (Q, 3 rr ) 



T(fi, J T ) 

9(0,^), S T 

J>(fi, ^ r ) 

S < c(n,S T ) 
ex0 5 ^(Q,3 rr ) 
exS(fi,^) 
exS^ft,^) 

/i,Z/ 
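Relative entropy, free energy, and specific free energy



$\mathcal{H}(\mu, \nu)$                 relative entropy of $\mu$ with respect to $\nu$

$\mathcal{H}_{\mathcal{A}}(\mu, \nu)$   relative entropy of $\mu$ with respect to $\nu$ on sub $\sigma$-algebra $\mathcal{A}$

$\mu_\Lambda$                           restriction of $\mu \in \mathcal{P}(\Omega, \mathcal{F}^\tau)$ to $\mathcal{F}^\tau_\Lambda$

$FE_\Lambda(\mu)$                       free energy of $\mu \in \mathcal{P}(\Omega, \mathcal{F})$ in $\Lambda$, i.e., $\mathcal{H}(\mu_\Lambda, e^{-H^o_\Lambda} \lambda^{|\Lambda| - 1})$

$SFE_\Lambda(\mu)$                      specific free energy of $\mu$ in $\Lambda$, i.e., $|\Lambda|^{-1} FE_\Lambda(\mu)$

$SFE(\mu)$                              specific free energy of $\mu$, i.e., $\lim_{n \to \infty} SFE_{\Lambda_n}(\mu)$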



Slopes and surface tension 

S(/x) slope of /x (where /x G ^(fi, 3 rT )) 

■u slope variable (a linear function from R d to R m ) 

cr*, a surface tension, a{u) = mf{SFE®(fj,) : /x G ^(fi, 9 r ' r ), 5'(/x) = u] 

P($) pressure of $, P($) = inf{,SFP*(/i) : /i G J>c(fi, = inf ueKmxd <r(u) 

£/$ interior of set of slopes u with a(u) < oo 

Extremal and ergodic decompositions 

ca evaluation map /x — > /x^ 

e(x) smallest cr algebra on a subset x 01 ^(^j 90 or ^(^j 9 rT ) that makes 

ca measurable for each A G 3 r 

extremal decomposition of /x G S T , a measure on (exS T , e(exS T )) or 
ergodic decomposition of /x G T.c(f2, 3 rr ), a measure on 

(ex^(fi,^),e(ex^(Q,^))) 

limit of Gibbs rerandomizations 7a„(-|0) of on A n 
7r^ shift-averaged limit of Gibbs rerandomizations of <p 

Topologies on probability spaces 

space of probability measures on a measure space (X, X) 
smallest topology on "J > (X, X) in which v 1— > v{A) is continuous for every 
A G X 

smallest topology on T(X, X) in which v 1— > is continuous for every 
bounded continuous function /011I 

topology of local convergence on T , (f2,3 : " r ), i.e., smallest topology in 
which the maps /x t— > /x(/) are continuous for every bounded function 
/ : Q — > H. that is 3^- measurable for some A CC Z d 
basis for .A given by set of finite intersections of sets of the form {/x : 
/x(F) < e}, where F : f2 — > IR is bounded and ^-measurable for some 
A CC z d . 



T(X, X) 
r-topology 

weak topology 

A 
Lattice approximations to continuous domains

D n subset of 7L d that approximates nD (e.g., nD fl Z d ) 

D n simplex domain derived from D n 

<f) n a function from D n to E 

<p n piecewise linear interpolation of <j) n to D n . 

4> n rescaling of 0„ to D n given by (f)(r)) = ^(nn) 

Young functions 

A Young function, i.e., a convex, even function A : 1 h R+ U {00} 

for which A(0) = 0, A is finite on some open interval, and A is not 
identically zero 

Ad Sobolev conjugate of A in d dimensions 

A* sub-conjugate of A, i.e., any Young function increasing essentially more 

slowly near infinity than Ad 

A(v),veR d Eti A (vi) 

Orlicz-Sobolev spaces 

~D a domain in R d , usually a member of G(^-) (defined below) 

\D\ Lebesgue measure of D 

a a multi-index a = (a±, . . . , ad) with < on G 7h d 

D a distributional a derivative 

\\f\\ A , D , \\f\\A m£{k\f D A(t®-)dr,<l} 
L A (D), L A Orlicz space {/ : \ \f\\ A ,D < 00} 

W j ' A (D) Orlicz-Sobolev space {/ G L A (D) : D a f G L A (D) for < \a\ < j}. 

\\Vf\\ AD inf{fc|/ D ^(^M)^<l}. 

P(E; D) perimeter of E relative to D, i.e., total variation over D of the gradient 

of the characteristic function of E 
G(z) set of bounded domains {D C R n } for which there exists a constant C 

such that [min-fjE'l, \D— E\}] z < CP(E; D) for all Lebesgue measurable 

subsets E of D. 
L A (D) L A (D)U™ =1 L A (D n ) 

Empirical measures and large deviations 

a;GA n n^ $0x4>-> called the empirical measure of on A n , a 
member of 3>(Q, 3) 
5 n {0 : L n (<f>) G 5} where B £ H> 

CI {(f) : |[0(x) - 0(x o )] - [0 u (x) - u (x o )]| < e for all x G <9A„\{:r }, where 

e is fixed independently of n and <p u is plane of slope u 

PBL-M limsup^ - lAnl" 1 log (/ l^n^e'^W Y\ x eA n \{*o} 

PBL^fi) sup B3lX:Be vPBL u M 
liming -|A n |-! log (/ l s „e-W) Yl xe A n \ {xo}

^B3^BeH FBL B{v) 

/fl^.^mj'fn)^ 1 ' called the empirical profile measure of n and defined 
as an element of CP(Z? x fi), with cr-algebra understood to be Lebesgue 
measure times 3 T 

Gibbs measure p: n on (fi, 9^ ) defined by Gibbs potential H° Dn 
measure on y{D x fi) x L$ induced by \i n and the map 0„ — > (-R^ n , n , 0n) 
slope of the probability measure fi(D', ■)/ n(D' x fi), for /x G T(D x fi) 
rate function of large deviations principle satisfied by p n , given by 

' SFE(p(D, •)) — P($) fi) is Lebesgue measure on Z) and 

fj,(D', •) is ^-invariant when \D'\ > and 
S(im(x, •)) = V/(s) 

k oo otherwise 

the topology on y(D x fi) x (£>) for the large deviations principle, 
given by the product of (on the first coordinate) the smallest topology 
in which fi — * /i(-D' x /) is measurable for all rectangular subsets D' 
of D and bounded cylinder functions / and (on the second coordinate) 
the Lq (D) topology 



Cluster swapping and triplets 

set of edges of the lattice Z d 
E [0, oof" 

fi fi x fi x E 

5F (j-algebra generated by jF t x J" 7 " times the product topology on E 

$A(0i,02,r) $a(0i) + $a(0 2 ) + E e ^(e) 

02 5 ?") cluster swapping map defined on fi x O x E 

/i height offset variable for /i G ^(fi, 3^), i.e., a function tail- measurable, 

ii-a.s. finite function /cHhIU {°°} suc h that /i(0 + c) = h((f>) + c 
for all G fi, c G E and /i-a.s. /i(0) = h{6 v (p) + (w, i>) when v G £ and 

« - 

/i(/i) height offset spectrum of /j G J ) ^(fi,9 rr ), i.e., the law of h(<f>) modulo 

one, if is chosen from /i, viewed as a measure on [0, 1) 

S c set of all edges for which (0i + c, 02, r) is swappable 

T c + set of vertices v in infinite clusters of S c complement for which 02(f) > 

0i (v) + c throughout the cluster 

T c _ set of vertices v in infinite clusters of S c complement for which 02(f) < 

0i (v) + c throughout the cluster 

B + inf{c : T c + is empty} 



PBL(fi) 

FBL B {n) 

FBL(n) 

Pn 
Pn 

S(p(D',-)) 
$B^-$                 $\sup\{c : T_c^-$ is empty$\}$

fi u minimal gradient phase of slope u (uniquely defined, under some con- 

ditions) 

fi Ui a extremal Gibbs measures, such that /i Uja a.s. h(<p) = a and \i u is the 

weighted average of fj, u<a where a is chosen from h(fi) 

Random subsets of Z 2 (Chapter EH only) 

f2r set of all of subsets F of Z 2 

jFr product a-algebra on fl r 

Pr a single infinite non-self-intersecting path forming the boundary of T 

(when such a path exists) 
Aoo the event that T and Z 2 \r are infinite connected sets 

A® the event that T = 

Ak event that A^ occurs and that both T fl and (7?\Ak) fl A k are 

non-empty 

A k assumed in this section to be shifted to be centered at the origin — i.e., 

A fc = [L-A;/2j,LA;/2-lj] 2 cZ 2 
B k event that there exists a path — consisting entirely of elements in T\Afc— 

which encircles the set A*. 
A(v) "shifted box" nv + A k 

A(v) nv + A k+1 

A(v) outer band of square faces around A(v) — i.e., the set of square faces 

of Z d that are incident to at least one vertex of A(v) and at least one 
vertex of A(v)\A(t> ) 

a n^A(^) 

A(v) event that Pr hits A v , and in between the first and last times Pr hits 

A(v), Pr hits no square which is fewer steps away from a A(w), with 
w 7^ v , than it is from A(v) 

v + , v_ given the event A(v), vertices such that A(y_) is the last band that the 

path hits before the first time it hits A(v), and A(v + ) is the first 
band that the path Pr hits after the last time it hits A(v ) 

C(v,w) event that some vertex incident to A(v) and some vertex incident to 

A(w) are in the same connected component of T\A 

p continuous interpolation of discrete path p 

C q (v, w) event that there exists a path p in T\A, connecting A(t>) and A(w) for 

which p is homotopically equivalent to q 
C£(v,w) event that w = v + and the path Pr from A(t>) to A(w) is homotopic 

to q 

C~(v,w) defined analogously using v + instead of t>_ 

B(v) event that there exists a cycle in T\A which disconnects A(v ) from 

infinity 
$W$                   closed countably punctured plane $\mathbb{R}^2 \setminus \mathbb{Z}^2 \cup \mathbb{Z}^2 \times S^1$, a space homeomorphic

                      to $\mathbb{R}^2 \setminus [\mathbb{Z}^2 + D]$ where $D$ is any disc of radius less than 1/2
a = (ai,a 2 ) fixed point in M 2 \Z 2 with irrational coordinates 

r x closed line segment from a to the point x (including its limit point 

x x arg(a — x)) 

u x portion of same ray (from a through x to infinity) which lies between 

x and oo, together with its limit point x x arg(x — a) 

x homotopy class of a path which follows r x from a towards x, then makes 

a counterclockwise loop around x, and then returns to a along r x 

P X:V set of continuous paths from x to y in W 

r' x linear segment from the a to x x arg(a — x), followed by a counter- 

clockwise arc from x x arg(a — x) to x x 
p' + , p'_ right, left derivative of p 

P p C P X; y homotopy class 

p taut version of path p 

$\epsilon$            $10^{-10000} \mu(A_\infty)^{10000}$

$\delta$              $10^{-100} \mu(A_\infty)^{1000}$
$\gamma$              $10^{-10} \mu(A_\infty)^{100}$

$\beta$               $10^{-1} \mu(A_\infty)^{10}$